Commit Graph

7683 Commits

Author SHA1 Message Date
Adela Vais
4855b98554 d: remove the @property attribute from interface Lexer
D best practices is to not use it.

* data/skeletons/lalr1.d: Remove attribute.
2020-10-02 06:35:17 +02:00
Akim Demaille
cd40ec9526 symbols: stop dealing with YYEMPTY as b4_symbol(-2, ...)
* data/skeletons/bison.m4 (b4_symbol): Redirect `b4_symbol(empty,
...)` to `b4_symbol(-2, ...)`.
Change all uses of the latter to the former.
2020-09-29 06:49:31 +02:00
Akim Demaille
a7daa63cb7 tests: remove useless prefix for EOF in D
* tests/calc.at (CALC_EOF): Rename as...
(EOF): this.
Since there is no risk of a clash with #define EOF here...
2020-09-29 06:49:31 +02:00
Adela Vais
2e5592b3ab d: support api.symbol.prefix and api.token.prefix
The D skeleton was not properly supporting them.

* data/skeletons/d.m4, data/skeletons/lalr1.d: Fix it.
* tests/calc.at: Check it.
2020-09-28 19:27:19 +02:00
Akim Demaille
9b18cac96b multistart: start more thorough testing
* tests/local.at (AT_MULTISTART_IF): New.
* tests/calc.at: Adjust to check multiple start symbols.
* data/skeletons/yacc.c (yy_parse_impl): Fix.
2020-09-28 19:26:53 +02:00
Akim Demaille
c19c1e7ec5 tests: style: reorder the calculator test macros
* tests/local.at (AT_TOKEN_TRANSLATE_IF): New, moved from...
* tests/calc.at: here.
Instead of sorting per feature (main, yylex, calc.y) and then by
language, do the converse, so that C bits are together, etc.
2020-09-27 19:29:29 +02:00
Akim Demaille
15cf0095d3 tests: shorten the generated file
* tests/synclines.at (_AT_SYNCLINES_COMPILE): Pull the comments out.
2020-09-27 19:29:29 +02:00
Akim Demaille
3ff7db34a2 gnulib: update 2020-09-27 11:58:28 +02:00
Akim Demaille
d441a34791 multistart: also give access to yynerrs
This is something that has always bothered me: with pure parsers (and
they all should be) the user does not have an (easy) access to yynerrs
at the end of the parse.  In the case of error recovery, that's the
only direct means to know if there were errors.  The usual approach
being having the user maintain a counter incremented each time yyerror
is called.

So here, also capture yynerrs in the return value of the start-symbol
parsing functions.

* data/skeletons/yacc.c (yy_parse_impl_t): New.
(yy_parse_impl): Use it.
(b4_accept): Fill it.
* examples/c/lexcalc/parse.y, examples/c/lexcalc/scan.l: No longer
pass nerrs as lex- and parse-param, just use the resulting yynerrs.
bistromathic and reccalc both demonstrate %param.
2020-09-27 11:58:28 +02:00
Akim Demaille
f4d33ff4b4 yacc.c: also count calls to YYERROR in yynerrs
* data/skeletons/yacc.c: here.
2020-09-27 11:58:27 +02:00
Akim Demaille
683040b324 multistart: allow tokens as start symbols
After all, why not?

* src/reader.c (switching_token): Use symbol_id_get.
(check_start_symbols): Require that the start symbol is a token only
if it's the only one.
* examples/c/lexcalc/parse.y: Let NUM be a start symbol.
2020-09-27 09:44:23 +02:00
Akim Demaille
d9cf99b6a5 multistart: use b4_accept instead of action post-processing
For each start symbol, generate a parsing function with a richer
return value than the usual of yyparse.  Reserve a place for the
returned semantic value, in order to avoid having to pass a pointer as
argument to "return" that value.  This also makes the call to the
parsing function independent of whether a given start-symbol is typed.

For instance, if the grammar file contains:

    %type <int> expression
    %start input expression

(so "input" is valueless) we get

    typedef struct
    {
      int yystatus;
    } yyparse_input_t;

    yyparse_input_t yyparse_input (void);

    typedef struct
    {
      int yyvalue;
      int yystatus;
    } yyparse_expression_t;

    yyparse_expression_t yyparse_expression (void);

This commit also changes the implementation of the parser termination:
when there are multiple start symbols, it is the initial rules that
explicitly YYACCEPT.  They do that after having exported the
start-symbol's value (if it is typed):

  switch (yyn)
    {
  case 1: /* $accept: YY_EXPRESSION expression $end  */
  { ((*yyvalue).TOK_expression) = (yyvsp[-1].TOK_expression); YYACCEPT; }
    break;

  case 2: /* $accept: YY_INPUT input $end  */
  { YYACCEPT; }
    break;

I have tried several ways to deal with termination, and this is the
one that appears the best one to me.  It is also the most natural.

* src/scan-code.h, src/scan-code.l (obstack_for_actions): New.
* src/reader.c (grammar_rule_check_and_complete): Generate the actions
of the rules for each start symbol.

* data/skeletons/bison.m4 (b4_symbol_slot): New, with safer semantics
than type and type_tag.
* data/skeletons/yacc.c (b4_accept): New.
Generates the body of the action of the start rules.
(_b4_declare_sub_yyparse): For each start symbol define a dedicated
return type for its parsing function.
Adjust the declaration of its parsing function.
(_b4_define_sub_yyparse): Adjust the definition of the function.

* examples/c/lexcalc/parse.y: Check the case of valueless symbols.
* examples/c/lexcalc/lexcalc.test: Check start symbols.
2020-09-27 09:44:18 +02:00
Akim Demaille
a6805bb8d9 multistart: adjust reader checks for generated rules
So far we were not checking the generated rule 0 at all.  Now there
can be several of them.  Instead of not checking at all, let's be more
selective on the check to run on them.

* src/reader.c (grammar_rule_check_and_complete): Don't check for
value usage for generated rules, it is ok to have a valued start
symbol, in which case it is ok for the generated rule ("accept: start
$end {}") to not use $1.
(packgram): Call grammar_rule_check_and_complete for all the rules.
2020-09-27 09:23:51 +02:00
Akim Demaille
a0b044186b todo: more 2020-09-27 09:23:51 +02:00
Akim Demaille
01af4ad9c3 multistart: toy with it in lexcalc
* examples/c/lexcalc/parse.y: Define several start symbols.
* examples/c/lexcalc/lexcalc.test: Check support.
2020-09-27 09:23:51 +02:00
Akim Demaille
ed324578a2 multistart: equip yacc.c
* data/skeletons/yacc.c: Add support for multiple start symbols.
2020-09-27 09:23:51 +02:00
Akim Demaille
05d6b54703 multistart: pass the list of start symbols to the backend
* src/output.c (start_symbols_output): New.
(muscles_output): Use it.
2020-09-27 09:23:51 +02:00
Akim Demaille
78d0e5e671 multistart: also check the HTML report
We don't actually have checks for HTML, so let's do it for multistart.

* tests/report.at: here.
2020-09-27 09:23:51 +02:00
Akim Demaille
85ccc1bab3 multistart: adjust computation of initial core and adjust reports
Currently the core of the initial state is limited to the single rule
on $accept.

* src/lr0.c (generate_states): There may now be several rules on
$accept.

* src/graphviz.c (conclude_red): Recognize "final" transitions by the
fact that we reduce to "$accept".
* src/print.c (print_reduction): Likewise.
* src/print-xml.c (print_reduction): Likewise.
2020-09-27 09:23:51 +02:00
Akim Demaille
4646be7db4 regen 2020-09-27 09:23:51 +02:00
Akim Demaille
8eaddf326b multistart: turn start symbols into rules on $accept
Now that the parser can read several start symbols, let's process
them, and create the corresponding rules.

* src/parse-gram.y (grammar_declaration): Accept a list of start symbols.
* src/reader.h, src/reader.c (grammar_start_symbol_set): Rename as...
(grammar_start_symbols_set): this.

* src/reader.h, src/reader.c (start_flag): Replace with...
(start_symbols): this.
* src/reader.c (grammar_start_symbols_set): Build a list of start
symbols.
(switching_token, create_start_rules): New.
(check_and_convert_grammar): Use them to turn the list of start
symbols into a set of rules.
* src/reduce.c (nonterminals_reduce): Don't complain about $accept,
it's an internal detail.
(reduce_grammar): Complain about all the start symbols that don't
derive sentences.

* src/symtab.c (startsymbol, startsymbol_loc): Remove, replaced by
start_symbols.
symbols_pack): Move the check about the start symbols
to...
* src/symlist.c (check_start_symbols): here.
Adjust to multiple start symbols.
* tests/reduce.at (Empty Language): Generalize into...
(Bad start symbols): this.
2020-09-27 09:23:51 +02:00
Akim Demaille
db68f61595 regen 2020-09-27 09:23:51 +02:00
Akim Demaille
7eca26e87b parser: expose a list of symbols
* src/parse-gram.y (%type): Also use current_class.
(symbol_decl.1): Rename as...
(symbols.1): this.
2020-09-27 09:23:51 +02:00
Akim Demaille
e50ec28153 reader: get ready to create several initial rules
* src/reader.c (create_start_rule): New.
Use it.
2020-09-27 09:23:50 +02:00
Akim Demaille
f7f2c99c28 gram: more debugging information
* src/gram.c (ritem_print): Show indices in ritem.
2020-09-27 09:23:50 +02:00
Akim Demaille
c08e0863be glr.cc: fix: use symbol_name
* data/skeletons/glr.cc: here.
2020-09-27 09:22:02 +02:00
Akim Demaille
f9b360663b glr2.cc: add an example
Currently this example crashes on input such as "T (x) + y;".
The same example with glr.c works properly.

* examples/c++/glr/Makefile, examples/c++/glr/README.md,
* examples/c++/glr/c++-types.test, examples/c++/glr/c++-types.yy,
* examples/c++/glr/local.mk, examples/c++/local.mk: New.
Based on examples/c/glr/c++-types.y.
2020-09-26 18:33:48 +02:00
Akim Demaille
3add9ffbde glr.cc: fix: use symbol_name
* data/skeletons/glr.cc: here.
2020-09-26 14:33:09 +02:00
Adela Vais
f296669c0f d: change the return value of yylex from int to TokenKind
* data/skeletons/lalr1.d: Change the return value.
* examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
* tests/scanner.at: Adjust.
* tests/calc.at (_AT_DATA_CALC_Y(d)): New, extracted from...
(_AT_DATA_CALC_Y(c)): here.
The two grammars have been sufficiently different to be separated.
Still trying to be them together results in a maintenance burden.  For
the same reason, instead of specifying the results for D and for the
rest, compute the expected results with D from the regular case.
2020-09-26 08:08:25 +02:00
Adela Vais
de638df104 d: support api.parser.extends and api.parser.implements
The D skeleton was not properly supporting them.

* data/skeletons/d.m4: Fix it.
* tests/d.at: Check it.
* tests/local.mk, tests/testsuite.at: Adjust.
2020-09-24 09:29:45 +02:00
Akim Demaille
8dc60543c8 glr2.cc: also run all the calculator tests
This revealed issues with yy_symbol_print and yy_reduce_print.
These changes, in turn, reactivated GCC10 warnings:

    559. calc.at:1258: testing Calculator glr2.cc %locations %header parse.error=verbose %debug api.prefix={calc} %verbose %parse-param {semantic_value *result}{int *count}{int *nerrs}  ...
    tests/calc.at:1258: COLUMNS=1000; export COLUMNS; NO_TERM_HYPERLINKS=1; export NO_TERM_HYPERLINKS;  bison --color=no -fno-caret -Wno-deprecated -o calc.cc calc.y
    tests/calc.at:1258: $CXX $CXXFLAGS $CPPFLAGS  $LDFLAGS -o calc calc.cc calc-lex.cc calc-main.cc $LIBS
    stderr:
    calc.cc: In function 'void glr_stack::yyresolveLocations(glr_state*, int)':
    calc.cc:2623:46: error: potential null pointer dereference [-Werror=null-dereference]
     2623 |                 yyrhsloc[0].getState().yyloc = yyoption->state()->yyloc;
          |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
    calc.cc:2623:46: error: potential null pointer dereference [-Werror=null-dereference]
     2623 |                 yyrhsloc[0].getState().yyloc = yyoption->state()->yyloc;
          |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
    calc.cc:2623:46: error: potential null pointer dereference [-Werror=null-dereference]
     2623 |                 yyrhsloc[0].getState().yyloc = yyoption->state()->yyloc;
          |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
    calc.cc:1177:10: error: potential null pointer dereference [-Werror=null-dereference]
     1177 |   return yypred ? &(asItem (this) - yypred)->getState () : YY_NULLPTR;
          |          ^~~~~~
    calc.cc:2623:46: error: potential null pointer dereference [-Werror=null-dereference]
     2623 |                 yyrhsloc[0].getState().yyloc = yyoption->state()->yyloc;
          |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
    calc.cc:2623:46: error: potential null pointer dereference [-Werror=null-dereference]
     2623 |                 yyrhsloc[0].getState().yyloc = yyoption->state()->yyloc;
          |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
    calc.cc: In member function 'YYRESULTTAG glr_stack::yyresolveValue(glr_state*)':
    calc.cc:2623:46: error: potential null pointer dereference [-Werror=null-dereference]
     2623 |                 yyrhsloc[0].getState().yyloc = yyoption->state()->yyloc;
          |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~

* tests/calc.at (AT_CHECK_CALC_GLR_CC): Also check glr2.cc.
* data/skeletons/glr2.cc: Don't pass the user arguments to
yy_symbol_print and yy_reduce_print, since they have it in the parser
object.
(b4_user_formals_no_comma, b4_pure_args, b4_lpure_args)
(b4_locuser_formals, b4_locuser_args): Remove, useless.
(YY_IGNORE_NULL_DEREFERENCE_BEGIN): Enable for GCC >= 10 too.
2020-09-21 06:43:10 +02:00
Akim Demaille
f8cd049ecc tests: check the location of the right-hand side symbols
The D skeleton was not properly supporting @1 etc.
Reported by Adela Vais.
https://lists.gnu.org/r/bison-patches/2020-09/msg00049.html

* data/skeletons/d.m4 (b4_rhs_location): Fix it.
* tests/calc.at: Check the support of @n for all the skeletons.
2020-09-20 17:24:06 +02:00
Akim Demaille
72946549ed style: formatting changes
* src/scan-code.l: here.
2020-09-20 08:23:28 +02:00
Akim Demaille
bad4fc09a7 style: introduce parse_positional_ref
* src/scan-code.l: here.
2020-09-20 08:23:28 +02:00
Akim Demaille
aac79ca103 style: clarify the way state kernels (aka cores) are built
Use state_list_append in a more natural way.

* src/lr0.c (generate_states): Here.
2020-09-20 08:23:28 +02:00
Akim Demaille
843f99886c style: reorder and comment
* src/reader.h: here.
2020-09-20 08:23:28 +02:00
Akim Demaille
647453a614 examples: add a demonstration of GLR parsers in C
Based on the test case 668 (cxx-type.at:437) "GLR: Merge conflicting
parses, pure, locations".

* examples/c/glr/Makefile, examples/c/glr/README.md,
* examples/c/glr/c++-types.test, examples/c/glr/c++-types.y,
* examples/c/glr/local.mk: New.
2020-09-19 18:05:15 +02:00
Akim Demaille
d4ffb69424 glr: support api.header.include
* data/skeletons/glr.c: here.
2020-09-19 17:50:28 +02:00
Akim Demaille
0711dca9d9 add support for --html
* bootstrap.conf: We need the "execute" module.
* src/files.h, src/files.c (spec_html_file, html_flag): New.
* src/getargs.h, src/getargs.c (--html): New.
* src/print-xml.h, src/print-xml.c (print_html): New.
* src/main.c: Use them.
* tests/output.at, tests/report.at: Check --html.
2020-09-19 17:49:03 +02:00
Akim Demaille
f5d4b64909 regen 2020-09-19 17:49:03 +02:00
Akim Demaille
b327f38832 deprecate %defines in favor of %header
This is consistent with --defines being deprecated in favor of
--header.  The directive %defines is also too similar to %define.
And %header matches nicely with api.header.name.

* src/scan-gram.l (%defines): Deprecate to %header.
(%header): Scan it.
* src/parse-gram.y (PERCENT_DEFINES): Replace with...
(PERCENT_HEADER): this.
* data/skeletons/lalr1.java
* doc/bison.texi
* tests/actions.at, tests/c++.at, tests/calc.at, tests/conflicts.at,
* tests/input.at, tests/java.at, tests/local.at, tests/output.at,
* tests/synclines.at, tests/types.at:
Convert most tests to check %header instead of %defines.
2020-09-19 17:49:03 +02:00
Akim Demaille
75c3746ce2 options: rename --defines as --header
The name "defines" is incorrect, the generated file contains far more
than just #defines.

* src/getargs.h, src/getargs.c (-H, --header): New option.
With optional argument, just like --defines, --xml, etc.
(defines_flag): Rename as...
(header_flag): this.
Adjust dependencies.
* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/glr2.cc, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c:
Adjust.
* examples, doc/bison.texi: Adjust.
* tests/headers.at, tests/local.at, tests/output.at: Convert most
tests from using --defines to using --header.
2020-09-19 08:31:49 +02:00
Akim Demaille
b329f0b5df CI: beware of time limits
* .travis.yml (GCC 8): Run only the part 1 of the tests.
2020-09-17 19:42:46 +02:00
Valentin Tolmer
12a5cc07e0 glr2.cc: replace refs to parser::symbol_kind_type with yysymbol_kind_t
* data/skeletons/glr2.cc: here.
2020-09-17 19:37:05 +02:00
Valentin Tolmer
27b5d92563 glr2.cc: fix container out-of-bounds access
Clang 10 with ASAN enabled reported errors in glr2.cc.

* data/skeletons/glr2.cc: here.
2020-09-17 19:35:47 +02:00
Akim Demaille
34476c449a glr2.cc: disable GCC 4.6 warning
231. conflicts.at:1096: testing Syntax error in consistent error state: glr2.cc ...
    tests/conflicts.at:1096: $CXX $CXXFLAGS $CPPFLAGS  $LDFLAGS -o input input.cc $LIBS
    input.cc: In member function 'YYRESULTTAG glr_stack::yyresolveValue(glr_state*)':
    input.cc:2674:36: error: 'yysval' may be used uninitialized in this function [-Werror=uninitialized]

Do not initialize the variable: this way ASAN can really make sure we
do set it to a proper value.
If we initialize it, ASAN would report nothing.

* data/skeletons/c.m4 (YY_IGNORE_MAYBE_UNINITIALIZED_BEGIN): Disable
GCC 4.6's -Wuninitialized.
* data/skeletons/glr2.cc: Disable the warning locally.
2020-09-15 07:27:00 +02:00
Akim Demaille
795a59aba4 glr2.cc: fix warning with GCC 4.7 and 4.8
231. conflicts.at:1096: testing Syntax error in consistent error state: glr2.cc ...
    tests/conflicts.at:1096: $CXX $CXXFLAGS $CPPFLAGS  $LDFLAGS -o input input.cc $LIBS
    input.cc: In function 'int yyparse(yy::parser&)':
    input.cc:3147:41: error: 'yyarg' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         return yytnamerr_ (yytname_[yysymbol]);
                                             ^
    input.cc:2058:34: note: 'yyarg' was declared here
         yy::parser::symbol_kind_type yyarg[YYERROR_VERBOSE_ARGS_MAXIMUM];
                                      ^

* data/skeletons/glr2.cc (yyreportSyntaxError): Initialize yyarg.
2020-09-14 19:53:01 +02:00
Akim Demaille
243fa94ce2 glr2.cc: simplify symbol kinds
We emit code like

    if (yytoken != yy::parser::symbol_kind::symbol_kind::S_YYEMPTY)

* data/skeletons/glr2.cc: Use b4_symbol correctly.
2020-09-14 19:30:48 +02:00
Akim Demaille
11995fec50 glr2.cc: fix warning about local variable vs. member
Fix a warning triggered in GCC (at least from 4.6 to 4.9):

    input.cc: In constructor 'glr_stack_item::glr_stack_item(bool)':
    input.cc:1371:5: error: declaration of 'is_state' shadows a member of 'this' [-Werror=shadow]
         : is_state_(is_state)
         ^

* data/skeletons/glr2.cc (glr_stack_item): Alpha-convert.
2020-09-14 06:34:07 +02:00
Akim Demaille
c654eb481c c++: variants: minor simplification
Do as Valentin Tolmer did in glr2.cc.

* data/skeletons/variant.hh: The union does not need to be named.
2020-09-13 13:47:25 +02:00