The C, C++ and D skeletons used to show the stack right after popping
the stack during the reduction. Now that the stack is printed after
reaching a new state, that has become useless:
Entering state 1
Stack now 0 1
Reducing stack by rule 5 (line 83):
$1 = token "number" (1)
-> $$ = nterm exp (1)
Stack now 0
Entering state 8
Stack now 0 8
Remove the "Stack now 0" line.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c:
Here.
Currently, if we have long rules and series of shift, we stack states
without showing stack. Let's be more incremental, and do how the Java
skeleton does.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c:
Here.
Adjust test cases.
* tests/torture.at (AT_DATA_STACK_TORTURE): Disable stack traces: this
test produces a very large stack, and showing the stack each time we
shift a token goes quadatric.
The Java skeleton displays
Reading a token:
Next token is token "number" (1)
while the other display
Reading a token: Next token is token "number" (1)
When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in. So let's propagate the Java way, but with the colon.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
* tests/local.at (AT_LANG_MATCH): New.
(AT_YYERROR_DECLARE(java), AT_YYERROR_DECLARE_EXTERN(java)): New.
* tests/calc.at: The grammar file for Java is quite different for the
others, and continuing to assemble it from pieces makes the grammar
file hard to understand. Let's also dispatch on the language to
assemble it, and isolate Java from the others.
Most of this comes from java.at.
This example, so far, was tracking the current token number, not the
current column number. This is not nice for an example...
* examples/java/Calc.y (PositionReader): New.
Use it.
* examples/java/Calc.test: Check the output.
AFAICT, autoboxing/unboxing was added in Java 5 (September 30, 2004).
I think we can afford to use it. It should help us merge some Java
tests with the main ones.
However, beware that != does not unbox: it compares the object
addresses.
* examples/java/Calc.y, tests/java.at: Simplify.
* examples/java/Calc.test, tests/java.at: Improve tests.
When building the test cases, emitting code in the epilogue is very
constraining. Let's make it simpler thanks to %code epilogue.
However, I don't want to document this: it is bad style to use it (we
should avoid having too many ways to write the same thing,
TI!MTOWTDI), just put your code in the true epilogue section.
* data/skeletons/glr.c, data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c: Implement support for %code epilogue.
Remove useless comments.
* tests/calc.at, tests/java.at: Simplify.
Modeled after what was done in yacc.c, yet somewhat different since
yyGLRStack play the role of the full yyparse_context_t. It will
probably be defined as a alias, to make interfaces compatible.
LAC is not supported here. And is somewhat tricky.
* data/skeletons/glr.c (yyexpected_tokens, yysyntax_error_arguments): New.
* data/skeletons/glr.c: Move the handling of error messages after the
definition of types. Especially after yyGLRStack, which we will need
in forthcoming patches.
Simplify: yyGLRStack is defined at this point.
The PDF output is more consistent: some nodes were not pointed to with
their title. The HTML output becomes "see section Foo" instead of
"see Foo", but this should be addressed in the next Texinfo release.
Info output is simplified, as it uses only the node name and not its
title. But it's considered easier to read this way.
See https://lists.gnu.org/r/help-texinfo/2020-01/msg00031.html.
* doc/bison.texi: Set @xrefautomaticsectiontitle on.
Simplify all uses of ref.
Add an example to demonstrate the use of push parser. I'm pleasantly
surprised: parse.error=detailed works like a charm with push parsers.
* examples/c/local.mk, examples/c/pushcalc/Makefile
* examples/c/pushcalc/README.md, examples/c/pushcalc/calc.test,
* examples/c/pushcalc/calc.y, examples/c/pushcalc/local.mk:
New.
* examples/c/calc/calc.y: Restore to its original state, with
parse.error=detailed instead of parse.error=custom (this example
should be simple).
* examples/c/calc/calc.test: Check syntax errors.
* examples/c/lexcalc/parse.y: Add comments.
We used to display the unexpected token first:
$ bison foo.y
foo.y:1.8-13: error: syntax error, unexpected %token, expecting character literal or identifier or <tag>
1 | %token %token
| ^~~~~~
GCC uses a different format:
$ gcc-mp-9 foo.c
foo.c:1:5: error: expected identifier or '(' before ')' token
1 | int()()()
| ^
and so does Clang:
$ clang-mp-9.0 foo.c
foo.c:1:5: error: expected identifier or '('
int()()()
^
1 error generated.
They display the unexpected token last (or not at all). Also, they
don't waste width with "syntax error". Let's try that. It gives, for
the same example as above:
$ bison foo.y
foo.y:1.8-13: error: expected character literal or identifier or <tag> before %token
1 | %token %token
| ^~~~~~
* src/complain.h, src/complain.c (syntax_error): New.
* src/parse-gram.y (yyreport_syntax_error): Use it.
As a test case, support translations in Bison itself.
* src/parse-gram.y: Mark the translatable tokens.
While at it, use clearer names.
* tests/input.at: Adjust expectations.
Don't force callers of location_caret to have to deal with flags that
disable it.
* src/location.h, src/location.c (location_caret)
(location_caret_suggestion): Early return if disabled.
* src/complain.c: Simplify.
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.
* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.
* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
Some users would like to avoid having to "parse" the *.y file to find
the strings to translate. Let's issue the translatable tokens with N_
to allow "parsing" the generated parsers instead.
See
https://lists.gnu.org/archive/html/bison-patches/2019-01/msg00015.html
* src/output.c (prepare_symbol_names): Issue symbol_names with N_()
markup.
In addition to
%token NUM "number"
accept
%token NUM _("number")
in which case the token will be translated in error messages.
Do not use _() in the output if there are no translatable tokens.
* src/symtab.h, src/symtab.c (symbol): Add a 'translatable' member.
* src/parse-gram.y (TSTRING): New token.
(string_as_id.opt): Replace with...
(alias): this.
Use it.
* src/scan-gram.l (SC_ESCAPED_TSTRING): New start conditions, to match
TSTRINGs.
* src/output.c (prepare_symbols): Define b4_translatable if there are
translatable strings.
* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c (yytnamerr): Receive b4_translatable, and use it.
* src/output.c (escape_trigraphs, xescape_trigraphs): New.
(prepare_symbol_names): Use it.
* tests/regression.at: Check the handling of trigraphs with
parse.error = detailed.