* data/skeletons/glr2.cc: Add support for api.token.constructor.
* examples/c++/glr/c++-types.yy: Use it.
* examples/c++/glr/c++-types.test: Adjust expectations for error
messages.
* examples/c/glr/c++-types.y (node_print): New.
Use YY_LOCATION_PRINT instead of duplicating it.
And actually use it in the action instead of badly duplicating it.
(main): Add proper option support.
* examples/c/glr/c++-types.test: Adjust expectations on locations.
* examples/c++/glr/c++-types.yy: Fix bad iteration.
Reported by Jot Dot.
https://lists.gnu.org/r/help-bison/2020-12/msg00014.html
* data/skeletons/glr.c, data/skeletons/glr2.cc (b4_call_merger): Use
the symbol's slot, not its type.
* examples/c/glr/c++-types.y: Use explicit per-symbol typing together
with api.value.type=union.
(yylex): Use yytoken_kind_t.
* data/skeletons/glr2.cc: Define value_type and location_type where
needed, and use them only.
(yyuserMerge): Make it a member function of the glr_state class.
Currently we are using pointers. The whole point of
glr2.cc (vs. glr.cc) is precisely to allow genuine C++ objects to be
semantic values. Let's make that work.
* data/skeletons/glr2.cc (glr_state::glr_state): Be sure to initialize
yysval.
(glr_state): Add copy-ctor, assignment and dtor.
(glr_state::copyFrom): Be sure to initialize the destination if it was
not.
(glr_state::~glr_state): Destroy the semantic value.
* examples/c++/glr/ast.hh: Rewrite so that we use genuine objects,
rather than a traditional OOP hierarchy that requires to deal with
pointers.
With help from Bruno Belanyi <bruno.belanyi@epita.fr>.
* examples/c++/glr/c++-types.yy: Remove memory management.
Use true objects.
(main): Don't reach yydebug directly.
* examples/c++/glr/local.mk: We need C++11.
A glr_state keeps tracks of its predecessor using an offset relative
to itself (i.e., pointer subtraction). Unfortunately we sometimes
have to compute offsets for pointers that live in different
containers, in particular in yyfillin. In that case there is no
reason for the distance between the two objects to be a multiple of
the object size (0x40 on my machine), and the resulting ptrdiff_t may
be "wrong", i.e., it does allow to recover one from the other. We
cannot use "typed" pointer arithmetics here, the Euclidean division
has it wrong. So use "plain" char* pointers.
Fixes 718 (Duplicate representation of merged trees: glr2.cc) and
examples/c++/glr/c++-types.
Still XFAIL:
712: Improper handling of embedded actions and dollar(-N) in GLR parsers: glr2.cc
730: Incorrectly initialized location for empty right-hand side in GLR: glr2.cc
748: Incorrect lookahead during nondeterministic GLR: glr2.cc
* data/skeletons/glr2.cc (glr_state::as_pointer_): New.
(glr_state::pred): Use it.
* examples/c++/glr/c++-types.test: The test passes.
* tests/glr-regression.at (Duplicate representation of merged trees:
glr2.cc): Passes.
* src/gram.h (rule_is_initial): New.
* src/graphviz.c, src/print-xml.c, src/print.c, src/lalr.c: Use it.
Some of these occurrences were incorrect (checking whether this is
rule 0), and not behaving properly in the case of multistart.
For each start symbol, generate a parsing function with a richer
return value than the usual of yyparse. Reserve a place for the
returned semantic value, in order to avoid having to pass a pointer as
argument to "return" that value. This also makes the call to the
parsing function independent of whether a given start-symbol is typed.
For instance, if the grammar file contains:
%type <int> expression
%start input expression
(so "input" is valueless) we get
typedef struct
{
int yystatus;
} yyparse_input_t;
yyparse_input_t yyparse_input (void);
typedef struct
{
int yyvalue;
int yystatus;
} yyparse_expression_t;
yyparse_expression_t yyparse_expression (void);
This commit also changes the implementation of the parser termination:
when there are multiple start symbols, it is the initial rules that
explicitly YYACCEPT. They do that after having exported the
start-symbol's value (if it is typed):
switch (yyn)
{
case 1: /* $accept: YY_EXPRESSION expression $end */
{ ((*yyvalue).TOK_expression) = (yyvsp[-1].TOK_expression); YYACCEPT; }
break;
case 2: /* $accept: YY_INPUT input $end */
{ YYACCEPT; }
break;
I have tried several ways to deal with termination, and this is the
one that appears the best one to me. It is also the most natural.
* src/scan-code.h, src/scan-code.l (obstack_for_actions): New.
* src/reader.c (grammar_rule_check_and_complete): Generate the actions
of the rules for each start symbol.
* data/skeletons/bison.m4 (b4_symbol_slot): New, with safer semantics
than type and type_tag.
* data/skeletons/yacc.c (b4_accept): New.
Generates the body of the action of the start rules.
(_b4_declare_sub_yyparse): For each start symbol define a dedicated
return type for its parsing function.
Adjust the declaration of its parsing function.
(_b4_define_sub_yyparse): Adjust the definition of the function.
* examples/c/lexcalc/parse.y: Check the case of valueless symbols.
* examples/c/lexcalc/lexcalc.test: Check start symbols.
Now that the parser can read several start symbols, let's process
them, and create the corresponding rules.
* src/parse-gram.y (grammar_declaration): Accept a list of start symbols.
* src/reader.h, src/reader.c (grammar_start_symbol_set): Rename as...
(grammar_start_symbols_set): this.
* src/reader.h, src/reader.c (start_flag): Replace with...
(start_symbols): this.
* src/reader.c (grammar_start_symbols_set): Build a list of start
symbols.
(switching_token, create_start_rules): New.
(check_and_convert_grammar): Use them to turn the list of start
symbols into a set of rules.
* src/reduce.c (nonterminals_reduce): Don't complain about $accept,
it's an internal detail.
(reduce_grammar): Complain about all the start symbols that don't
derive sentences.
* src/symtab.c (startsymbol, startsymbol_loc): Remove, replaced by
start_symbols.
symbols_pack): Move the check about the start symbols
to...
* src/symlist.c (check_start_symbols): here.
Adjust to multiple start symbols.
* tests/reduce.at (Empty Language): Generalize into...
(Bad start symbols): this.
* data/skeletons/lalr1.d: Change the return value.
* examples/d/calc/calc.y, examples/d/simple/calc.y: Adjust.
* tests/scanner.at: Adjust.
* tests/calc.at (_AT_DATA_CALC_Y(d)): New, extracted from...
(_AT_DATA_CALC_Y(c)): here.
The two grammars have been sufficiently different to be separated.
Still trying to be them together results in a maintenance burden. For
the same reason, instead of specifying the results for D and for the
rest, compute the expected results with D from the regular case.
The yyerror stand-alone function was used to bounce from glr.c's call
to yyerror to glr.cc's parser.error. Now that glr.c is out of the
way, just directly use parser.error.
* data/skeletons/glr2.cc (yyerror): Remove.
Adjust callers.
(b4_yyerror_args, b4_lyyerror_args, b4_pure_formals): Remove.
Now unused.
* examples/c/bistromathic/parse.y (user_context): We need the current
line.
(yyreport_syntax_error): Quote the guilty line, with squiggles.
* examples/c/bistromathic/bistromathic.test: Adjust.
* data/bison-default.css: Cobalt does not seem to be supported.
* doc/bison.texi (Counterexamples): A new section.
(Understanding): Show the counterexamples as it shows in the report:
with its items.
(Bison Options): Document -Wcex and -rcex.
Currently when a push parser finishes its parsing (i.e., it did not
return YYPUSH_MORE), it also clears its state. It is therefore
impossible to see if it had parse errors.
In the context of autocompletion, because error recovery might have
fired, the parser is actually already in a different state. For
instance on `(1 + + <TAB>` in the bistromathic, because there's a
`exp: "(" error ")"` recovery rule, `1 + +` tokens have already been
popped, replaced by `error`, and autocompletions think we are ready
for the closing ")". So here, we would like to see if there was a
syntax error, yet `yynerrs` was cleared.
In the case of a successful parse, we still have a problem: if error
recovery succeeded, we won't know it, since, again, `yynerrs` is
clearer.
It seems much more natural to leave the parser state available for
analysis when there is a failure.
To reuse the parser, we should either:
1. provide an explicit means to reinitialize a parser state for future
parses.
2. automatically reset the parser state when it is used in a new
parse.
Option 2 requires to check whether we need to reinitialize the parser
each time we call `yypush_parse`, i.e., each time we give a new token.
This seems expensive compared to Option 1, but benchmarks revealed no
difference. Option 1 is incompatible with the documentation
("After `yypush_parse` returns a status other than `YYPUSH_MORE`, the
parser instance `yyps` may be reused for a new parse.").
So Option 2 wins, reusing the private `yynew` member to record that a
parse was finished, and therefore that the state must reset in the
next call to `yypull_parse`.
While at it, this implementation now reuses the previously enlarged
stacks from one parse to another.
* data/skeletons/yacc.c (yypstate_new): Set up the stacks in their
initial configurations (setting their bottom to the stack array), and
use yypstate_clear to reset them (moving their top to their bottom).
(yypstate_delete): Adjust.
(yypush_parse): At the beginning, clear yypstate if needed, and at the
end, record when yypstate needs to be clearer.
* examples/c/bistromathic/parse.y (expected_tokens): Do not propose
autocompletion when there are parse errors.
* examples/c/bistromathic/bistromathic.test: Check that case.
AFAICT, "dotted rule" is a more frequent synonym of "item" than
"pointed rule". So let's migrate to using "dot" only.
* doc/bison.texi: Use dot/'•' rather than point/'.'.
* src/print-xml.c (print_core): Use dot rather than point. This is
not backward compatible, but AFAICT, we don't have actual user of the
XML output (but ourselves). So...
* data/xslt/xml2dot.xsl, data/xslt/xml2text.xsl,
* data/xslt/xml2xhtml.xsl, tests/report.at: ... adjust.
This was a hack to make it easier for people to migrate from yacc.c to
lalr1.cc and from glr.c to glr.cc: when set, YYSTYPE and YYLTYPE were
`#defined`. It was never documented (just mentioned in NEWS for Bison
2.2, 2006-05-19), but was used to simplify the test suite. Stop that:
adjust the test suite to the skeletons, not the converse.
In C++ use yy::parser::semantic_type, yy::parser::location_type, and
yy::parser::token::MY_TOKEN, instead of YYSTYPE, YYLTYPE and MY_TOKEN.
* data/skeletons/glr.cc, data/skeletons/lalr1.cc: Remove its support.
* tests/actions.at, tests/c++.at, tests/calc.at: Adjust.