Commit Graph

378 Commits

Author SHA1 Message Date
Akim Demaille
6d86d26b33 tests: factor iterating over skeletons
* tests/local.at (AT_FOR_EACH_SKEL): New.
Use where appropriate.
* data/skeletons/lalr1.d: Reject -d.
* tests/input.at, tests/scanner.at: Also check D.
2021-08-07 12:53:19 +02:00
Akim Demaille
80db1029e6 m4: catch suspicions of unevaluated macros
Check in m4's output if there are sequences such as m4_foo or b4_foo,
which are probably resulting from incorrect m4 processing.

It actually already is useful:

- it caught a leaking b4_lac_if leaking from glr.c, where LAC is not
  supported, hence b4_lac_if is not defined.

- it also caught references to location.hh in position.hh when
  location.hh does not exist.

- while making "Code injection" robust to these new warnings (it is
  its very purpose to let b4_canary pass unevaluated), I saw that it
  did not check lalr1.d, and when adding lalr1.d, it revealed it did
  underquote ocurrences of token value types.

* src/scan-skel.l (macro): New abbreviation.
Use it.
* data/skeletons/glr.c: Don't use b4_lac_if, we don't have it.
* data/skeletons/location.cc: Don't generate position.hh when we don't
generate location.hh.
* data/skeletons/d.m4 (b4_basic_symbol_constructor_define): Fix
underquotation.
* data/skeletons/bison.m4 (b4_canary): New.
* tests/input.at (Code injection): Use it, and check lalr1.d too.
2021-08-07 12:53:19 +02:00
Akim Demaille
952479fca7 scan: fix typo in UTF-8 escape
We had:

```
-mbchar    ...|\xF0[\x\90-\xBF]([\x80-\xBF]{2})|...
+mbchar    ...|\xF0[\x90-\xBF]([\x80-\xBF]{2})|...
```

so a precise sequence that matches the incorrect regex can let NUL
bytes pass through, which triggers an assertion violation downstream.
It is a pity that Flex does not report an error for such input.

Reported by Ahcheong Lee <ahcheong.lee@gmail.com>.
<https://lists.gnu.org/r/bug-bison/2021-04/msg00003.html>

* src/scan-gram.l (mbchar): Fix the bad regex.
* tests/input.at (Invalid inputs): Check that case.
2021-08-03 12:22:52 +02:00
Paul Eggert
b4582f1918 Update URLs to prefer https: to http:
Also, fix a few http: URLs that were no longer working.
2021-01-29 13:48:43 -08:00
Akim Demaille
d7e8aaa271 package: bump copyrights to 2021
Run 'make update-copyright'.
2021-01-16 16:11:17 +01:00
Akim Demaille
5b19f91ccf multistart: check duplicates
* src/symlist.h, src/symlist.c (symbol_list_find_symbol)
(symbol_list_last): New.
(symbol_list_append): Use symbol_list_last.
* src/reader.c (grammar_start_symbols_add): Check and discard duplicates.
* tests/input.at (Duplicate %start symbol): New.
* tests/reduce.at (Bad start symbols): Add the multistart keyword.
2020-11-30 16:48:03 +01:00
Akim Demaille
829ac9caed java: lac: more tests, and some doc
* doc/bison.texi: C++ and Java support LAC.
* tests/input.at (LAC: Errors for %define): Generalize the test, and
apply it to Java.
2020-11-04 07:28:13 +01:00
Akim Demaille
b327f38832 deprecate %defines in favor of %header
This is consistent with --defines being deprecated in favor of
--header.  The directive %defines is also too similar to %define.
And %header matches nicely with api.header.name.

* src/scan-gram.l (%defines): Deprecate to %header.
(%header): Scan it.
* src/parse-gram.y (PERCENT_DEFINES): Replace with...
(PERCENT_HEADER): this.
* data/skeletons/lalr1.java
* doc/bison.texi
* tests/actions.at, tests/c++.at, tests/calc.at, tests/conflicts.at,
* tests/input.at, tests/java.at, tests/local.at, tests/output.at,
* tests/synclines.at, tests/types.at:
Convert most tests to check %header instead of %defines.
2020-09-19 17:49:03 +02:00
Valentin Tolmer
ef09bf065a glr2.cc: fork glr.cc to a c++ version
This is a fork of glr.cc to be c++-first instead of a wrapper around
glr.c.

* data/skeletons/glr2.cc: New.
* data/skeletons/bison.m4, data/skeletons/c++.m4: Adjust.
* data/skeletons/c.m4 (b4_user_args_no_comma): New.
* src/reader.c (grammar_rule_check_and_complete): glr2.cc is C++.
* tests/actions.at, tests/c++.at, tests/calc.at, tests/conflicts.at,
* tests/input.at, tests/local.at, tests/regression.at, tests/scanner.at,
* tests/synclines.at, tests/types.at: Also check glr2.cc.
2020-08-30 10:45:21 +02:00
Akim Demaille
b801b7b670 fix: unterminated \-escape
An assertion failed when the last character is a '\' and we're in a
character or a string.
Reported by Agency for Defense Development.
https://lists.gnu.org/r/bug-bison/2020-08/msg00009.html

* src/scan-gram.l: Catch unterminated escapes.
* tests/input.at (Unexpected end of file): New.
2020-08-08 07:53:33 +02:00
Akim Demaille
b7aab2dbad fix: crash when redefining the EOF token
Reported by Agency for Defense Development.
https://lists.gnu.org/r/bug-bison/2020-08/msg00008.html

On an empty such as

    %token FOO
           BAR
           FOO 0
    %%
    input: %empty

we crash because when we find FOO 0, we decrement ntokens (since FOO
was discovered to be EOF, which is already known to be a token, so we
increment ntokens for it, and need to cancel this).  This "works well"
when EOF is properly defined in one go, but here it is first defined
and later only assign token code 0.  In the meanwhile BAR was given
the token number that we just decremented.

To fix this, assign symbol numbers after parsing, not during parsing,
so that we also saw all the explicit token codes.  To maintain the
current numbers (I'd like to keep no difference in the output, not
just equivalence), we need to make sure the symbols are numbered in
the same order: that of appearance in the source file.  So we need the
locations to be correct, which was almost the case, except for nterms
that appeared several times as LHS (i.e., several times as "foo:
...").  Fixing the use of location_of_lhs sufficed (it appears it was
intended for this use, but its implementation was unfinished: it was
always set to "false" only).

* src/symtab.c (symbol_location_as_lhs_set): Update location_of_lhs.
(symbol_code_set): Remove broken hack that decremented ntokens.
(symbol_class_set, dummy_symbol_get): Don't set number, ntokens and
nnterms.
(symbol_check_defined): Do it.
(symbols): Don't count nsyms here.
Actually, don't count nsyms at all: let it be done in...
* src/reader.c (check_and_convert_grammar): here.  Define nsyms from
ntokens and nnterms after parsing.
* tests/input.at (EOF redeclared): New.

* examples/c/bistromathic/bistromathic.test: Adjust the traces: in
"%nterm <double> exp %% input: ...", exp used to be numbered before
input.
2020-08-07 07:30:06 +02:00
Akim Demaille
cb65553449 diagnostics: better location for type redeclarations
From

    foo.y:1.7-11: error: %type redeclaration for bar
        1 | %type <foo> bar bar
          |       ^~~~~
    foo.y:1.7-11: note: previous declaration
        1 | %type <foo> bar bar
          |       ^~~~~

to

    foo.y:1.17-19: error: %type redeclaration for bar
        1 | %type <foo> bar bar
          |                 ^~~
    foo.y:1.13-15: note: previous declaration
        1 | %type <foo> bar bar
          |             ^~~

* src/symlist.h, src/symlist.c (symbol_list_type_set): There's no need
for the tag's location, use that of the symbol.
* src/parse-gram.y: Adjust.
* tests/input.at: Adjust.
2020-08-01 08:54:46 +02:00
Akim Demaille
be95a4fe29 scanner: don't crash on strings containing a NUL byte
We crash if the input contains a string containing a NUL byte.
Reported by Suhwan Song.
https://lists.gnu.org/r/bug-bison/2020-07/msg00051.html

* src/flex-scanner.h (STRING_FREE): Avoid accidental use of
last_string.
* src/scan-gram.l: Don't call STRING_FREE without calling
STRING_FINISH first.
* tests/input.at (Invalid inputs): Check that case.
2020-07-28 19:01:48 +02:00
Akim Demaille
6b78e50cef cex: make "rerun with '-Wcex'" a note instead of a warning
Currently the suggestion to rerun is a -Wother warning:

    warning: 2 shift/reduce conflicts [-Wconflicts-sr]
    warning: rerun with option '-Wcounterexamples' to generate conflict counterexamples [-Wother]

Instead, let's attach it as a subnote of the diagnostic (in the
current case, -Wconflicts-sr):

    warning: 2 shift/reduce conflicts [-Wconflicts-sr]
    note: rerun with option '-Wcounterexamples' to generate conflict counterexamples

* src/conflicts.c (conflicts_print): Do that.
Adjust the test suite.
2020-07-21 18:57:56 +02:00
Akim Demaille
eeafc706e8 c++: by default, use const std::string for file names
Reported by Martin Blais and Yuriy Solodkyy.
https://lists.gnu.org/r/help-bison/2020-05/msg00011.html
https://lists.gnu.org/r/bug-bison/2020-06/msg00038.html

While at it, modernize filename_type as api.filename.type and document
it properly.

* data/skeletons/c++.m4 (filename_type): Rename as...
(api.filename.type): this.
Default to const std::string.
* data/skeletons/location.cc (position, location): Expose the
filename_type type.
Use api.filename.type.
* doc/bison.texi (%define Summary): Document api.filename.type.
(C++ Location Values): Document position::filename_type.
* src/muscle-tab.c (muscle_percent_variable_update): Ensure backward
compatibility.
* tests/c++.at: Check that using const file names is ok.
tests/input.at: Check backward compat.
2020-06-27 10:06:00 +02:00
Akim Demaille
c857ed4f72 style: prefer 'FOO ()' to 'FOO' for function-like macros
* src/flex-scanner.h (STRING_GROW, STRING_FINISH, STRING_FREE):
Make them function-like macros.
Adjust dependencies.
2020-06-13 15:20:56 +02:00
Akim Demaille
b0bb4cde2e cex: suggest -Wcounterexamples when there are unexpected conflicts
Suggesting -Wcounterexamples when there are conflicts is probably not
what the user wants.  If she knows her conflicts and has set
%expect/%expect-rr appropriately, we shouldn't warn.

The commit also swaps the counterexamples and the report of conflicts,
into, IMHO, a more natural order: from

    Shift/reduce conflict on token B:
    1:    3 a: A .
    1:    8 y: A . B
    Example              A • B C
    First derivation     s ::=[ a ::=[ A • ] x ::=[ B C ] ]
    Example              A • B C
    Second derivation    s ::=[ y ::=[ A • B ] c ::=[ C ] ]

    input.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
    input.y:4.4: warning: rule useless in parser due to conflicts [-Wother]

to

    input.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
    Shift/reduce conflict on token B:
    1:    3 a: A .
    1:    8 y: A . B
    Example              A • B C
    First derivation     s ::=[ a ::=[ A • ] x ::=[ B C ] ]
    Example              A • B C
    Second derivation    s ::=[ y ::=[ A • B ] c ::=[ C ] ]

    input.y:4.4: warning: rule useless in parser due to conflicts [-Wother]

* src/conflicts.c (rule_conflicts_print): Rename as...
(report_rule_expectation_mismatches): this.
Move the handling of report_counterexamples to...
(conflicts_print): Here.
Display this warning when applicable.
2020-06-10 09:51:39 +02:00
Joshua Watt
dd878d1851 bison: add command line option to map file prefixes
Teaches bison about a new command line option, --file-prefix-map OLD=NEW
(based on the -ffile-prefix-map option from GCC) which causes it to
replace and file path of OLD in the text of the output file with NEW,
mainly for header guards and comments. The primary use of this is to
make builds reproducible with different input paths, and in particular
the debugging information produced when the source code is compiled. For
example, a distro may know that the bison source code will be located at
"/usr/src/bison" and thus can generate bison files that are reproducible
with the following command:

    bison --output=/build/bison/parse.c -d --file-prefix-map=/build/bison/=/usr/src/bison/ parse.y

Importantly, this will change the header guards and #line directives
from:

    #ifndef YY_BUILD_BISON_PARSE_H
    #line 100 "/build/bison/parse.h"

to

    #ifndef YY_USR_SRC_BISON_PARSE_H
    #line 100 "/usr/src/bison/parse.h"

which is reproducible.

See https://lists.gnu.org/r/bison-patches/2020-05/msg00016.html
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>

* src/files.h, src/files.c (spec_mapped_header_file)
(mapped_dir_prefix, map_file_name, add_prefix_map): New.
* src/getargs.c (-M, --file-prefix-map): New option.
* src/output.c (prepare): Define b4_mapped_dir_prefix and
b4_spec_header_file.
* src/scan-skel.l (@ofile@): Output the mapped file name.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/location.cc,
* data/skeletons/yacc.c:
Adjust.
* doc/bison.texi: Document.
* tests/input.at, tests/output.at: Check.
2020-05-24 15:17:15 +02:00
Akim Demaille
e7aff57122 style: rename user_token_number as code
This should have been done in 3.6, but I wanted to avoid introducing
conflicts into Vincent's work on counterexamples.  It turns out it's
completely orthogonal.

* data/README.md, data/skeletons/bison.m4, data/skeletons/c++.m4,
* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/java.m4,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/variant.hh, data/skeletons/yacc.c, src/conflicts.c,
* src/derives.c, src/gram.c, src/gram.h, src/output.c,
* src/parse-gram.c, src/parse-gram.y, src/print-xml.c, src/print.c,
* src/reader.c, src/symtab.c, src/symtab.h, tests/input.at,
* tests/types.at:
s/user_token_number/code/g.
Plus minor changes.
2020-05-23 08:43:58 +02:00
Akim Demaille
ff4d67ede8 CI: add GCC 10 and Clang 10
* .travis.yml: Here.
* tests/input.at, tests/regression.at: Beware of clang's -Wdocumentation.
2020-05-17 08:28:12 +02:00
Akim Demaille
465babb635 fix: do not emit nested comments
With input such as

    %token<fl> yVL_CLOCK "/*verilator sc_clock*/"

we generate

    yVL_CLOCK = 610,      /* "/*verilator sc_clock*/"  */

which is invalid since the comment will actually be closed on the
first "*/".  Let's turn "*/" into "*\/" to avoid this.  But GCC will
also warn about "/*" inside a comment, so let's "escape" it too.

Reported by Huang Rui.
https://github.com/akimd/bison/issues/38

* data/skeletons/c-like.m4 (_b4_comment): Escape comment delimiters in
comments.
* tests/input.at (Torturing the Scanner): Check thes cases.
* tests/m4.at: New.
2020-05-17 08:28:12 +02:00
Akim Demaille
cd4e799da4 error: rename the error token from YYERRCODE to YYerror
See https://lists.gnu.org/r/bison-patches/2020-04/msg00162.html.

* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.java, doc/bison.texi,
* examples/c/bistromathic/parse.y, src/scan-gram.l, src/symtab.c
(YYERRCODE): Rename as...
(YYerror): this.
Adjust dependencies.
2020-04-28 07:54:07 +02:00
Akim Demaille
7346163840 dogfooding: use YYERRCODE in our scanner
* src/scan-gram.l: Use it.
* tests/input.at: Adjust.
2020-04-27 08:21:50 +02:00
Akim Demaille
89c4e1becf scanner: avoid spurious errors about empty character literals
On an invalid character literal such as "'\777'" we used to produce
two errors:

    input.y:2.9-12: error: invalid number after \-escape: 777
    input.y:2.8-13: error: empty character literal

Get rid of the second one.

* src/scan-gram.l (STRING_GROW_ESCAPE): New.
* tests/input.at: Adjust.
2020-04-27 08:06:49 +02:00
Akim Demaille
3262747c5b scanner: bad character literals are errors
* src/scan-gram.l: These are errors, not warnings.
* tests/input.at: Adjust.
2020-04-27 07:17:04 +02:00
Akim Demaille
286d0755f8 all: prefer YYERRCODE to YYERROR
We will not keep YYERRCODE anyway, it causes backward compatibility
issues.  So as a first step, let all the skeletons use that name,
until we have a better one.

* data/skeletons/bison.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c, doc/bison.texi, tests/headers.at,
* tests/input.at:
here.
2020-04-26 15:09:51 +02:00
Akim Demaille
ffa46e6516 skeletons: clarify the tag of special tokens
From

    GRAM_EOF = 0,                  /* $end  */
    GRAM_ERRCODE = 1,              /* error  */
    GRAM_UNDEF = 2,                /* $undefined  */

to

    GRAM_EOF = 0,                  /* "end of file"  */
    GRAM_ERRCODE = 1,              /* error  */
    GRAM_UNDEF = 2,                /* "invalid token"  */

* src/output.c (symbol_tag): New.
Use it to pass the token names and the symbol tags to the skeletons.

* tests/input.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
a555b41990 diagnostics: replace "user token number" by "token code"
Yet, don't change the structure identifier to avoid introducing
conflicts in Vincent Imbimbo's PR (which, amusingly enough, is about
conflicts).

* src/symtab.c: here.
* tests/diagnostics.at, tests/input.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
e50de09886 tokens: properly define the YYEOF token kind
Currently EOF is handled in an adhoc way, with a #define YYEOF 0 in
the implementation file.  As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.

Give the $end token a visible kind name, YYEOF.  Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.

* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
2020-04-12 13:56:44 +02:00
Akim Demaille
95421df67b tokens: define the "$undefined" token kind
* data/skeletons/bison.m4 (b4_symbol_token_kind): Give a definition to
$undefined.
(b4_token_visible_if): $undefined has an id.
* src/output.c (prepare_symbol_definitions): Stop lying: $undefined
_is_ a token.
* tests/input.at: Adjust.
2020-04-12 13:56:43 +02:00
Akim Demaille
a4ed94bc13 tokens: properly define the "error" token kind
There are people out there that do use YYERRCODE (the token kind of
the error token).  See for instance
3812012bb7/unixODBC-2.3.2/Drivers/nn/yylex.c.

Currently, YYERRCODE is defined by yacc.c in an adhoc way as a #define
in the *.c file only.  It belongs with the other token kinds.

YYERRCODE is not a nice name, it does not fit in our naming scheme.
YYERROR would be more logical, but it collides with the YYERROR macro.
Shall we keep the same name in all the skeletons?  Besides, to avoid
collisions in C, we need to apply the api prefix: YYERRCODE is
actually <PREFIX>ERRCODE.  This is not needed in the other languages.

* data/skeletons/bison.m4 (b4_symbol_token_kind): New.
Map the error token to "YYERRCODE".
* data/skeletons/yacc.c (YYERRCODE): Don't define it, it's handled by...
* src/output.c (prepare_symbol_definitions): this.
* tests/input.at (Redefining the error token): Check it.
2020-04-12 13:56:43 +02:00
Akim Demaille
951da960e6 merge branch 'maint'
* upstream/maint:
  maint: post-release administrivia
  version 3.5.3
  news: update for 3.5.3
  yacc.c: make sure we properly propagated the user's number for error
  diagnostics: don't crash because of repeated definitions of error
  style: initialize some struct members
  diagnostics: beware of zero-width characters
  diagnostics: be sure to close the styling when lines are too short
  muscles: fix incorrect decoding of $
  code: be robust to reference with invalid tags
  build: fix typo
  doc: update recommandation for libtextstyle
  style: comment changes
  examples: use consistently the GFDL header for readmes
  style: remove useless declarations
  typo: succesful -> successful
  README: point to tests/bison, and document --trace
  gnulib: update
  maint: post-release administrivia
2020-03-08 10:13:16 +01:00
Akim Demaille
e3812bb8c3 yacc.c: make sure we properly propagated the user's number for error
* data/skeletons/yacc.c (YYERRCODE): Be truthful.
* tests/input.at (Redefining the error token): Check that.
2020-03-08 08:10:11 +01:00
Akim Demaille
cfcd823e16 diagnostics: don't crash because of repeated definitions of error
According to https://www.unix.com/man-page/POSIX/1posix/yacc/, the
user is allowed to specify her user number for the error token:

    The token error shall be reserved for error handling. The name
    error can be used in grammar rules. It indicates places where the
    parser can recover from a syntax error. The default value of error
    shall be 256. Its value can be changed using a %token
    declaration. The lexical analyzer should not return the value of
    error.

I think this feature is useless, the user should not have to deal with
that.  The intend is probably to give the user a means to use 256 if
she wants to, but provided "error" cleared the path first by being
assigned another number.  In the case of Bison, 256 is assigned to
"error" at the end if the user did not use it for a token of hers.  So
this feature is useless.

Yet it is valid, and if the user assigns twice a token number to
"error", then the second time we want to complain about it and want to
show the original definition.  At this point, we try to display the
built-in definition of "error", whose location is NULL, and we crash.

Rather, the location of the first user definition of "error" should
become its defining location.

Reported byg Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00007.html

* src/symtab.c (symbol_class_set): If this is a declaration and the
symbol was not declared yet, keep this as defining location.
* tests/input.at (Redefining the error token): New.
2020-03-08 08:10:11 +01:00
Akim Demaille
b82b387da9 muscles: fix incorrect decoding of $
Bug introduced in 458171e6df.
https://lists.gnu.org/archive/html/bison-patches/2013-11/msg00009.html

Reported by Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00010.html

* src/muscle-tab.c (COMMON_DECODE): "$" is coded as "$][", not "$[][".
* tests/input.at ("%define" enum variables): Check that case.
2020-03-07 07:45:10 +01:00
Akim Demaille
641e326303 code: be robust to reference with invalid tags
Because we want to support $<a->b>$, we must accept -> in type tags,
and reject $<->$, as it is unfinished.
Reported by Ahcheong Lee.

* src/scan-code.l (yylex): Make sure "tag" does not end with -, since
-> does not close the tag.
* tests/input.at (Stray $ or @): Check this.
2020-03-06 17:29:26 +01:00
Victor Morales Cayuela
e09a72eeb0 diagnostics: modernize the display of submessages
Since Bison 2.7, output was indented four spaces for explanatory
statements.  For example:

    input.y:2.7-13: error: %type redeclaration for exp
    input.y:1.7-11:     previous declaration

Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:

    input.y:2.7-13: error: %type redeclaration for exp
        2 | %type <float> exp
          |       ^~~~~~~
    input.y:1.7-11: note: previous declaration
        1 | %type <int> exp
          |       ^~~~~

* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
2020-02-15 08:28:40 +01:00
Akim Demaille
77bdcc6f0c parse.error: document and diagnose the incompatibility with %token-table
* doc/bison.texi (Tokens from Literals): Move to code using
%token-table to...
(Decl Summary: %token-table): here.
* data/skeletons/bison.m4: Implement mutual exclusion.
* tests/input.at: Check it.
* doc/local.mk: Be robust to the removal of doc/.
2020-02-10 20:15:46 +01:00
Akim Demaille
fc2191f137 diagnostics: modernize bison's syntax errors
We used to display the unexpected token first:

    $ bison foo.y
    foo.y:1.8-13: error: syntax error, unexpected %token, expecting character literal or identifier or <tag>
        1 | %token %token
          |        ^~~~~~

GCC uses a different format:

    $ gcc-mp-9 foo.c
    foo.c:1:5: error: expected identifier or '(' before ')' token
        1 | int()()()
          |     ^

and so does Clang:

    $ clang-mp-9.0 foo.c
    foo.c:1:5: error: expected identifier or '('
    int()()()
        ^
    1 error generated.

They display the unexpected token last (or not at all).  Also, they
don't waste width with "syntax error".  Let's try that.  It gives, for
the same example as above:

    $ bison foo.y
    foo.y:1.8-13: error: expected character literal or identifier or <tag> before %token
        1 | %token %token
          |        ^~~~~~

* src/complain.h, src/complain.c (syntax_error): New.
* src/parse-gram.y (yyreport_syntax_error): Use it.
2020-01-23 08:30:28 +01:00
Akim Demaille
2cc361387c diagnostics: translate bison's own tokens
As a test case, support translations in Bison itself.

* src/parse-gram.y: Mark the translatable tokens.
While at it, use clearer names.
* tests/input.at: Adjust expectations.
2020-01-23 08:26:28 +01:00
Adrian Vogelsgesang
4ab2cf7450 larlr1.cc: Reject unsupported values for parse.lac
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.

* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
2020-01-21 06:57:21 +01:00
Adrian Vogelsgesang
172f103c1e larlr1.cc: Reject unsupported values for parse.lac
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.

* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
2020-01-21 06:22:27 +01:00
Akim Demaille
c67daa9a97 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-10 19:16:23 +01:00
Akim Demaille
8036635251 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-05 10:26:35 +01:00
Akim Demaille
8a910107b3 diagnostics: complain about undeclared string tokens
String literals, which allow for better error messages, are (too)
liberally accepted by Bison, which might result in silent errors.  For
instance

    %type <exVal> cond "condition"

does not define “condition” as a string alias to 'cond' (nonterminal
symbols do not have string aliases).  It is rather equivalent to

    %nterm <exVal> cond
    %token <exVal> "condition"

i.e., it gives the type 'exVal' to the "condition" token, which was
clearly not the intention.

Introduce -Wdangling-alias to catch this.

* src/complain.h, src/complain.c: Add support for -Wdangling-alias.
(argmatch_warning_args): Sort.
* src/symtab.c (symbol_check_defined): Complain about dangling
aliases.
* doc/bison.texi: Document it.
* tests/input.at (Dangling aliases): New test.
2019-11-17 18:27:42 +01:00
Akim Demaille
28d1ca8f48 diagnostics: yacc reserves %type to nonterminals
On

    %token TOKEN1
    %type  <ival> TOKEN1 TOKEN2 't'
    %token TOKEN2
    %%
    expr:

bison -Wyacc gives

    input.y:2.15-20: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |               ^~~~~~
    input.y:2.29-31: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |                             ^~~
    input.y:2.22-27: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |                      ^~~~~~

The messages appear to be out of order, but they are emitted when the
error is found.

* src/symtab.h (symbol_class): Add pct_type_sym, used to denote
symbols appearing in %type.
* src/symtab.c (complain_pct_type_on_token): New.
(symbol_class_set): Check that %type is not applied to tokens.
(symbol_check_defined): pct_type_sym also means undefined.
* src/parse-gram.y (symbol_decl.1): Set the class to pct_type_sym.
* src/reader.c (grammar_current_rule_begin): pct_type_sym also means
undefined.
* tests/input.at (Yacc's %type): New.
2019-11-17 09:45:25 +01:00
Paul Eggert
052215a138 bison: check for int overflow when scanning
* src/scan-gram.l: Include errno.h, for errno.
(scan_integer, handle_syncline): Check for integer overflow.
* tests/input.at (too-large.y): Adjust to match new diagnostics.
2019-10-17 11:51:20 -07:00
Akim Demaille
8631f35bf9 tests: factor the generation of files without the final eol
AFAICT Autotest 2.69 still does not support AT_DATA without the final
eol.

* tests/local.at (AT_DATA_NO_FINAL_EOL): New.
* tests/input.at: Use it.
2019-10-13 09:55:44 +02:00
Akim Demaille
c483b6593f tests: refactor the handling of Perl
Let's make a difference between places where Perl is required for the
test (AT_PERL_REQUIRE), and the places where it's used to run the
test, but it's not not to run the test (AT_PERL_CHECK).

* tests/local.at (AT_REQUIRE): New.
(AT_PERL_CHECK, AT_PERL_REQUIRE): New.
Use them where appropriate.

* tests/local.mk ($(TESTSUITE)): Beware not to start the line with
'-pi' if Perl is empty, as Make understands this as "it's ok to fail".
Which it is not.
2019-10-13 09:22:05 +02:00
Akim Demaille
0c56c195e0 tests: be really robust to Perl missing
My previous tests (with ./configure PERL=false) have been fooled by
configure, that managed to find perl anyway.  This time, I ran this on
a Fedora in Docker, without Perl.

* tests/calc.at, tests/diagnostics.at, tests/headers.at,
* tests/input.at, tests/local.at, tests/named-refs.at,
* tests/output.at, tests/regression.at, tests/skeletons.at,
* tests/synclines.at, tests/torture.at: Don't require Perl.
2019-10-11 06:53:45 +02:00