Commit Graph

1828 Commits

Author SHA1 Message Date
Akim Demaille
ab1208e263 glr: consistently use the same wording in traces
* data/skeletons/glr.c, data/skeletons/glr2.cc (yyglrReduce): Traces
refer to "state 42", not to "state #42".
2021-01-03 08:14:22 +01:00
Akim Demaille
3911aba39a %merge: associate it to its first definition, not the latest
Currently each time we meet %merge we record this location as the
defining location (and symbol).  Instead, record the first definition.

In the generated code we go from

    yy0->A = merge (*yy0, *yy1);

to

    yy0->S = merge (*yy0, *yy1);

where S was indeed the first symbol, and in the diagnostics we go from

    glr-regr18.y:30.18-24: error: result type clash on merge function 'merge': <type2> != <type1>
       30 | sym2: sym3 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:29.18-24: note: previous declaration
       29 | sym1: sym2 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:31.13-19: error: result type clash on merge function 'merge': <type3> != <type2>
       31 | sym3: %merge<merge> { $$ = 0; } ;
          |             ^~~~~~~
    glr-regr18.y:30.18-24: note: previous declaration
       30 | sym2: sym3 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~

to

    glr-regr18.y:30.18-24: error: result type clash on merge function 'merge': <type2> != <type1>
       30 | sym2: sym3 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:29.18-24: note: previous declaration
       29 | sym1: sym2 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:31.13-19: error: result type clash on merge function 'merge': <type3> != <type1>
       31 | sym3: %merge<merge> { $$ = 0; } ;
          |             ^~~~~~~
    glr-regr18.y:29.18-24: note: previous declaration
       29 | sym1: sym2 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~

where both duplicates are reported against definition 1, rather than
using definition 1 as a reference when diagnosing about definition 2,
and then 2 as a reference for 3.

* src/reader.c (record_merge_function_type): Keep the first definition.
* tests/glr-regression.at: Adjust.
2020-12-31 08:07:34 +01:00
Akim Demaille
8bc45673d5 %merge: test support for api.value.type=union
* tests/glr-regression.at: here.
2020-12-31 08:07:34 +01:00
Akim Demaille
edfcca8481 %merge: clearer tests on diagnostics
* tests/glr-regression.at: Use caret errors.
2020-12-31 08:07:11 +01:00
Akim Demaille
94701b4e5e style: rename semanticVal as value
* data/skeletons/README-D.txt: Remove, now useless and obsolete.
* data/skeletons/glr2.cc, examples/d/calc/calc.y,
* tests/calc.at, tests/d.at, tests/scanner.at (semanticVal): Replace
with...
(value): this.
2020-12-26 14:26:23 +01:00
Akim Demaille
8db99c54f4 tests: don't require YYSTYPE/YYLTYPE to be defined in C++
* tests/glr-regression.at: Use AT_YYSTYPE/AT_YYLTYPE to generate
yy::parser::value_type and yy::parser::location_type in C++.
2020-12-26 11:55:01 +01:00
Adela Vais
32bb53870b d: remove unnecessary methods from the Lexer interface
The complete symbol approach in yylex removes the need for the methods
semanticVal, startPos and endPos, which were used when the values were
reported separately.

* data/skeletons/lalr1.d: Here.
* doc/bison.texi: Remove sections about the three methods.
* examples/d/calc/calc.y, examples/d/simple/calc.y: Remove the unused methods.
* tests/calc.at, tests/d.at, tests/scanner.at: Test it.
2020-12-21 15:53:32 +01:00
Akim Demaille
c0f3b55b25 style: address syntax-check diagnostics
* examples/c/glr/c++-types.y: Formatting changes.
* po/POTFILES.in: Add missing files.
* src/reader.c: Remove useless include.
* tests/calc.at: Avoid magic values for exit.
Obfuscate calls to error.
2020-12-21 07:51:02 +01:00
Adela Vais
20d657c1dd d: create alias Position for YYPosition
* data/skeletons/d.m4 (b4_public_types_declare): Here.
* data/skeletons/lalr1.d: Adjust.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y: Use it.
* tests/calc.at: Test it.
2020-12-21 07:16:25 +01:00
Adela Vais
b00fa62e95 d: create alias Value for YYSemanticType
* data/skeletons/d.m4: Here.
* data/skeletons/lalr1.d, examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
* tests/calc.at, tests/d.at, tests/scanner.at: Test it.
2020-12-21 07:13:48 +01:00
Adela Vais
8e44b24ba8 d: create alias Location for YYLocation
* data/skeletons/d.m4: Here.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y: Adjust.
* tests/calc.at: Test it.
2020-12-21 07:13:17 +01:00
Adela Vais
cc04459cfe d: reduce verbosity for returning the location from yylex()
* examples/d/calc/calc.y (start, end): Replace by this...
(location): new member variable in the Lexer class.
Use it.
* tests/calc.at: Use the defined location variable.
2020-12-21 06:54:35 +01:00
Adela Vais
13bb2b78b3 d: create alias Symbol for YYParse.Symbol
* data/skeletons/lalr1.d: Here.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
* tests/calc.at, tests/d.at, tests/scanner.at: Test it.
2020-12-21 06:47:57 +01:00
Akim Demaille
fe0102d4d5 glr2.cc: fix GLR stack expansion
When expanding the GLR stack, none of the pointers were updated to
reflect the new location of the displaced objects.

This fixes

    748: Incorrect lookahead during nondeterministic GLR: glr2.cc

* data/skeletons/glr2.cc (yyexpandGLRStack): Update the split point
and the stack tops.
(reduceToOneStack): Factor a bit.
2020-12-20 14:54:47 +01:00
Akim Demaille
b33817f8bb glr: tests: add support for the YYDEBUG envvar
When debugging these parsers, we really need debug traces.
Enable them, and bind them to $YYDEBUG.

* tests/glr-regression.at: Support the YYDEBUG envvar.
As a consequence, now that syntactic ambiguities are reported, adjust
the expected output.
(No users destructors if stack 0 deleted): Don't return 0 on memory
exhaustion, really return the parser's status, and adust expectations.
2020-12-16 07:38:59 +01:00
Akim Demaille
c8006f4637 glr2.cc: fix yycompressStack
Currently, yycompressStack expects the free items to be states only.
That's not the case.

Fixes 712 and 730 pass.  748 still fails, but later and
differently (heap-use-after-free).

* data/skeletons/glr2.cc (glr_stack_item::setState): New.
(glr_stack_item::yycompressStack): Use it.
* tests/glr-regression.at: Adjust.
2020-12-14 06:33:22 +01:00
Akim Demaille
5b65b3d543 glr2.cc: fix pointer arithmethics
A glr_state keeps tracks of its predecessor using an offset relative
to itself (i.e., pointer subtraction).  Unfortunately we sometimes
have to compute offsets for pointers that live in different
containers, in particular in yyfillin.  In that case there is no
reason for the distance between the two objects to be a multiple of
the object size (0x40 on my machine), and the resulting ptrdiff_t may
be "wrong", i.e., it does allow to recover one from the other.  We
cannot use "typed" pointer arithmetics here, the Euclidean division
has it wrong.  So use "plain" char* pointers.

Fixes 718 (Duplicate representation of merged trees: glr2.cc) and
examples/c++/glr/c++-types.

Still XFAIL:

    712: Improper handling of embedded actions and dollar(-N) in GLR parsers: glr2.cc
    730: Incorrectly initialized location for empty right-hand side in GLR: glr2.cc
    748: Incorrect lookahead during nondeterministic GLR: glr2.cc

* data/skeletons/glr2.cc (glr_state::as_pointer_): New.
(glr_state::pred): Use it.
* examples/c++/glr/c++-types.test: The test passes.
* tests/glr-regression.at (Duplicate representation of merged trees:
glr2.cc): Passes.
2020-12-14 06:33:21 +01:00
Akim Demaille
42fe808292 glr2.cc: run the glr-regression tests
When installed on master as of 2020-12-05 (on top of "glr2.cc: fix
when the stack is not expandable", almost all the GLR regression tests
fail (with a SEGV):

    709: Badly Collapsed GLR States: glr2.cc             FAILED (glr-regression.at:130)
    712: Improper handling of embedded actions and dollar(-N) in GLR parsers: glr2.cc FAILED (glr-regression.at:275)
    715: Improper merging of GLR delayed action sets: glr2.cc FAILED (glr-regression.at:404)
    718: Duplicate representation of merged trees: glr2.cc FAILED (glr-regression.at:502)
    721: User destructor for unresolved GLR semantic value: glr2.cc FAILED (glr-regression.at:566)
    724: User destructor after an error during a split parse: glr2.cc FAILED (glr-regression.at:624)
    727: Duplicated user destructor for lookahead: glr2.cc FAILED (glr-regression.at:724)
    730: Incorrectly initialized location for empty right-hand side in GLR: glr2.cc FAILED (glr-regression.at:823)
    733: No users destructors if stack 0 deleted: glr2.cc FAILED (glr-regression.at:911)
    736: Corrupted semantic options if user action cuts parse: glr2.cc FAILED (glr-regression.at:974)
    739: Undesirable destructors if user action cuts parse: glr2.cc FAILED (glr-regression.at:1042)
    742: Leaked semantic values if user action cuts parse: glr2.cc FAILED (glr-regression.at:1173)
    748: Incorrect lookahead during nondeterministic GLR: glr2.cc FAILED (glr-regression.at:1546)
    751: Leaked semantic values when reporting ambiguity: glr2.cc FAILED (glr-regression.at:1639)
    754: Leaked lookahead after nondeterministic parse syntax error: glr2.cc FAILED (glr-regression.at:1710)
    757: Uninitialized location when reporting ambiguity: glr2.cc FAILED (glr-regression.at:1794)
    766: Predicates: glr2.cc                             FAILED (glr-regression.at:2045)

These pass:

    745: Incorrect lookahead during deterministic GLR: glr2.cc ok
    760: Missed %merge type warnings when LHS type is declared later: glr2.cc ok
    763: Ambiguity reports: glr2.cc ok

With Valentin Tolmer's "glr2.cc: Fix memory corruption bug" commit,
these test fail "gracefully":

    712: Improper handling of embedded actions and dollar(-N) in GLR parsers: glr2.cc FAILED (glr-regression.at:268)
    730: Incorrectly initialized location for empty right-hand side in GLR: glr2.cc FAILED (glr-regression.at:816)
    748: Incorrect lookahead during nondeterministic GLR: glr2.cc FAILED (glr-regression.at:1539)

And these do not end:

    709: Badly Collapsed GLR States: glr2.cc             FAILED (glr-regression.at:123)
    715: Improper merging of GLR delayed action sets: glr2.cc FAILED (glr-regression.at:397)
    718: Duplicate representation of merged trees: glr2.cc FAILED (glr-regression.at:495)
    751: Leaked semantic values when reporting ambiguity: glr2.cc FAILED (glr-regression.at:1632)

With "tests: glr2.cc: run the glr-regression tests", none loop, and
709, 715, and 751 pass.  Only 718 still fails.

* tests/glr-regression.at: Run all the tests with glr2.cc.
* tests/local.at (AT_GLR2_CC_IF): New.
2020-12-06 15:11:33 +01:00
Akim Demaille
d4188398f1 glr.c: fix line numbers in logs
* data/skeletons/glr.c (yyglrReduce): Fix line numbers.
* tests/glr-regression.at: Fix expectations.
2020-12-06 13:54:45 +01:00
Akim Demaille
8ab625e517 tests: glr: run the glr regression tests with glr.cc
* tests/glr-regression.at: Adjust the tests to be more independent of
the language, and run them with glr.cc.
Some tests relied on yychar, yylval and yylloc being global variables:
pass arguments instead.
2020-12-05 12:39:59 +01:00
Akim Demaille
edf6aa7d3b tests: glr: prepare for more languages
* tests/glr-regression.at: Wrap each test in an AT_TEST.
Call it with glr.c.
2020-12-05 12:39:59 +01:00
Akim Demaille
4aec17af93 tests: glr: use AT_FULL_COMPILE
* tests/glr-regression.at: Instead of using AT_BISON_CHECK and
AT_COMPILE, use AT_FULL_COMPILE.  This is shorter, and makes it easier
to add support for other programming languages.
2020-12-05 11:22:20 +01:00
Akim Demaille
c98b8856e6 tests: glr: prefer directives to warnings
* tests/glr-regression.at: Use %expect and %expect-rr in the grammar
files, rather than accepting diagnostics.
This will make it easier to support other programming languages.
2020-12-05 11:22:20 +01:00
Akim Demaille
4ec49a8585 tests: factor the access to token kinds
* tests/local.at (AT_BISON_OPTION_PUSHDEFS): Define AT_TOKEN.
(AT_BISON_OPTION_POPDEFS): Undefine it.
* tests/actions.at, tests/c++.at, tests/calc.at: Use AT_TOKEN.
2020-12-05 11:22:20 +01:00
Akim Demaille
627d275a3a tests: formatting changes
* tests/local.at: here.
2020-12-05 11:22:20 +01:00
Akim Demaille
24233748ec tables: avoid warnings and save bits
The yydefgoto table uses -1 as an invalid for an impossible case (we
never use yydefgoto[0], since it corresponds to the reduction to
$accept, which never happens).  Since yydefgoto is a table of state
numbers, this -1 forces a signed type uselessly, which (1) might
trigger compiler warnings when storing a value from yydefgoto into a
state number (nonnegative), and (2) wastes bits which might result in
using a int16 where a uint8 suffices.

Reported by Jot Dot <jotdot@shaw.ca>.
https://lists.gnu.org/r/bug-bison/2020-11/msg00027.html

* src/tables.c (default_goto): Use 0 rather than -1 as invalid value.
* tests/regression.at: Adjust.
2020-12-03 06:21:53 +01:00
Akim Demaille
5b19f91ccf multistart: check duplicates
* src/symlist.h, src/symlist.c (symbol_list_find_symbol)
(symbol_list_last): New.
(symbol_list_append): Use symbol_list_last.
* src/reader.c (grammar_start_symbols_add): Check and discard duplicates.
* tests/input.at (Duplicate %start symbol): New.
* tests/reduce.at (Bad start symbols): Add the multistart keyword.
2020-11-30 16:48:03 +01:00
Adela Vais
e5854bbddd d: change YYLocation's type from class to struct
This avoids heap allocation and gives minimal costs for the
creation and destruction of the YYParser.Symbol struct if
the location tracking is active.

Suggested by H. S. Teoh.

* data/skeletons/lalr1.d: Here.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y: Adjust.
* tests/calc.at: Test it.
2020-11-20 22:09:31 +01:00
Adela Vais
10305f3e94 d: change the return value of yylex from TokenKind to YYParser.Symbol
The complete symbol approach was deemed to be the right approach for Dlang.
Now, the user can return from yylex() an instance of YYParser.Symbol structure,
which binds together the TokenKind, the semantic value and the location. Before,
the last two were reported separately to the parser.
Only the user API is changed, Bisons's internal structure is kept the same.

* data/skeletons/d.m4 (struct YYParser.Symbol): New.
* data/skeletons/lalr1.d: Change the return value.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y, examples/d/simple/calc.y: Demonstrate it.
* tests/calc.at, tests/scanner.at: Test it.
2020-11-20 22:09:31 +01:00
Adela Vais
593724366f d: add support for lookahead correction
When using lookahead correction, the method YYParser.Context.getExpectedTokens
is not annotated with const, because the method calls yylacCheck, which is not
const. Also, because of yylacStack and yylacEstablished, yylacCheck needs to
be called from the context of the parser class, which is sent as parameter to
the Context's constructor.

* data/skeletons/lalr1.d (yylacCheck, yylacEstablish, yylacDiscard,
yylacStack, yylacEstablished): New.
(Context): Use it.
* doc/bison.texi: Document it.
* tests/calc.at: Check it.
2020-11-18 08:14:43 +01:00
Adela Vais
0e51f6146a d: change the name of the custom error message function to reportSyntaxError
Changed from syntax_error to reportSyntaxError to be similar to the Java parser.

* data/skeletons/lalr1.d: Change the function name.
* doc/bison.texi: Document it.
* tests/local.at: Adjust.
2020-11-18 08:14:21 +01:00
Akim Demaille
23472033ee Merge branch 'maint'
* maint:
  c++: shorten the assertions that check whether tokens are correct
  c++: don't glue functions together
  lalr1.cc: YY_ASSERT should use api.prefix
  c++: don't use YY_ASSERT at all if parse.assert is disabled
  c++: style: follow the Bison m4 quoting pattern
  yacc.c: provide the Bison version as an integral macro
  regen
  style: make conversion of version string to int public
  %require: accept version numbers with three parts ("3.7.4")
  yacc.c: fix #definition of YYEMPTY
  gnulib: update
  doc: fix incorrect section title
  doc: minor grammar fixes in counterexamples section
2020-11-13 07:01:19 +01:00
Akim Demaille
8b424b865e lalr1.cc: YY_ASSERT should use api.prefix
Working on the previous commit I realized that YY_ASSERT was used in
the generated headers, so must follow api.prefix to avoid clashes when
multiple C++ parser with variants are used.

Actually many more macros should obey api.prefix (YY_CPLUSPLUS,
YY_COPY, etc.).  There was no complaint so far, so it's not urgent
enough for 3.7.4, but it should be addressed in 3.8.

* data/skeletons/variant.hh (b4_assert): New.
Use it.
* tests/local.at (AT_YYLEX_RETURN): Fix.
* tests/headers.at: Make sure variant-based C++ parsers are checked
too.
This test did find that YY_ASSERT escaped renaming (before the fix in
this commit).
2020-11-13 06:17:52 +01:00
Akim Demaille
7cc9107053 style: enforce java coding style
* tests/scanner.at: here.
2020-11-10 07:55:36 +01:00
Adela Vais
dc15b62a7c d: add the custom error message feature
Parser.Context class returns a const YYLocation, so Lexer's method
yyerror() needs to receive the location as a const parameter.

Internal error reporting flow is changed to be similar to that of
the other skeletons. Before, case YYERRLAB was calling yyerror()
with the result of yysyntax_error() as the string parameter. As the
custom error message lets the user decide if they want to use
yyerror() or not, this flow needed to be changed. Now, case YYERRLAB
calls yyreportSyntaxError(), that builds the error message using
yysyntaxErrorArguments(). Then yyreportSyntaxError() passes the
error message to the user defined syntax_error() in case of a custom
message, or to yyerror() otherwise.

In the tests in tests/calc.at, the order of the tokens needs to be
changed in order of precedence, so that the D program outputs the
expected tokens in the same order as the other parsers.

* data/skeletons/lalr1.d: Add the custom error message feature.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y: Adjust.
* tests/calc.at, tests/local.at: Test it.
2020-11-07 17:03:23 +01:00
Adela Vais
abf5f7f90e d: add yyerrok
In D's case, yyerrok() is a private method of the Parser class.
It can be called directly as `yyerrok()` from the grammar rules section.

* data/skeletons/lalr1.d: Add yyerrok().
* examples/d/calc/calc.y, examples/d/simple/calc.y: Demonstrate yyerrok().
* tests/calc.at: Update D tests to use yyerrok().
2020-11-07 07:14:52 +01:00
Akim Demaille
d49da0101a java: lac: a stronger test for the exploratory stack
* tests/local.at (AT_YYLEX_DEFINE(java)): Fix overquoting issue.
Style changes.
* tests/regression.at (LAC: Exploratory stack): Run for lalr1.java too.
2020-11-06 07:37:15 +01:00
Akim Demaille
829ac9caed java: lac: more tests, and some doc
* doc/bison.texi: C++ and Java support LAC.
* tests/input.at (LAC: Errors for %define): Generalize the test, and
apply it to Java.
2020-11-04 07:28:13 +01:00
Akim Demaille
f7a7310433 java: avoid Integer(String), use parseInt
examples/java/calc/Calc.java:1531: warning: [deprecation] Integer(String) in Integer has been deprecated
          yylval = new Integer(st.sval);
                   ^

* examples/java/calc/Calc.y, examples/java/simple/Calc.y,
* tests/calc.at, tests/scanner.at: Use Integer.parseInt.
2020-11-03 08:46:54 +01:00
Akim Demaille
1995d9e2f2 java: lac: check it
* tests/calc.at: Add tests for LAC in pull and push parsers.
Skip LAC: line from the logs.
* tests/local.at (reportSyntaxError): Output the error message in a
single call, to avoid having the error message on stderr be
interrupted by the debug traces of LAC in getExpectedTokens.
2020-11-03 08:46:54 +01:00
Akim Demaille
787052f074 style: java: more style fixes
* data/skeletons/lalr1.java, tests/java.at: Avoid space before paren.
And use braces as is customary in the Java.
2020-11-03 08:46:54 +01:00
Akim Demaille
36143b5ecc report: put the dot after %empty in items
When printing items, it is clearer to put the dot after %emtpy rather
than before:

     0 $accept: . unit "end of file"
     1 unit: . assignments exp
-    2 assignments: . %empty
+    2 assignments: %empty .
     3            | . assignments assignment

Also, use the Unicode characters if they are supported.

* src/gram.c (item_print): Put the dot after %emtpy.
* tests/conflicts.at, tests/reduce.at, tests/report.at: Adjust.
2020-10-07 06:28:52 +02:00
Adela Vais
7cd195b30a d: change api.token.raw default value to true
Generate consecutive values for enum TokenKind, as D's yylex()
returns TokenKind and collisions can't happen.

* data/skeletons/d.m4: Change default value.
* tests/scanner.at, tests/d.at: Check it.
2020-10-03 17:17:32 +02:00
Akim Demaille
a7daa63cb7 tests: remove useless prefix for EOF in D
* tests/calc.at (CALC_EOF): Rename as...
(EOF): this.
Since there is no risk of a clash with #define EOF here...
2020-09-29 06:49:31 +02:00
Adela Vais
2e5592b3ab d: support api.symbol.prefix and api.token.prefix
The D skeleton was not properly supporting them.

* data/skeletons/d.m4, data/skeletons/lalr1.d: Fix it.
* tests/calc.at: Check it.
2020-09-28 19:27:19 +02:00
Akim Demaille
9b18cac96b multistart: start more thorough testing
* tests/local.at (AT_MULTISTART_IF): New.
* tests/calc.at: Adjust to check multiple start symbols.
* data/skeletons/yacc.c (yy_parse_impl): Fix.
2020-09-28 19:26:53 +02:00
Akim Demaille
c19c1e7ec5 tests: style: reorder the calculator test macros
* tests/local.at (AT_TOKEN_TRANSLATE_IF): New, moved from...
* tests/calc.at: here.
Instead of sorting per feature (main, yylex, calc.y) and then by
language, do the converse, so that C bits are together, etc.
2020-09-27 19:29:29 +02:00
Akim Demaille
15cf0095d3 tests: shorten the generated file
* tests/synclines.at (_AT_SYNCLINES_COMPILE): Pull the comments out.
2020-09-27 19:29:29 +02:00
Akim Demaille
78d0e5e671 multistart: also check the HTML report
We don't actually have checks for HTML, so let's do it for multistart.

* tests/report.at: here.
2020-09-27 09:23:51 +02:00
Akim Demaille
85ccc1bab3 multistart: adjust computation of initial core and adjust reports
Currently the core of the initial state is limited to the single rule
on $accept.

* src/lr0.c (generate_states): There may now be several rules on
$accept.

* src/graphviz.c (conclude_red): Recognize "final" transitions by the
fact that we reduce to "$accept".
* src/print.c (print_reduction): Likewise.
* src/print-xml.c (print_reduction): Likewise.
2020-09-27 09:23:51 +02:00