The error format should be translated, but contrary to the case of
C/C++, we cannot just depend on macros to adapt on the
presence/absence of '_'. Let's consider that the message format is to
be translated iff there are some internationalized tokens.
* src/output.c (prepare_symbol_names): Define b4_has_translations.
* data/skeletons/java.m4 (b4_trans): New.
* data/skeletons/lalr1.java: Use it to emit translatable or not the
format string.
And allow syntax error messages for 'detailed' and 'verbose' to be
translated.
* data/skeletons/lalr1.java (Context, yyexpectedTokens)
(yysyntaxErrorArguments): New.
(yysyntax_error): Use it.
In Java there is no need for N_ and yytranslate_. So instead of
hard-coding the use of N_ in the table of the symbol names, rely on
b4_symbol_translate.
* src/output.c (prepare_symbol_names): Use b4_symbol_translate instead
of N_.
* data/skeletons/c.m4 (b4_symbol_translate): New.
* data/skeletons/lalr1.java (yysymbolName): New.
Use it.
* examples/java/calc/Calc.y: Use parse.error=detailed.
* tests/calc.at: Check parse.error=detailed.
We used to emit:
/** Token number,to be returned by the scanner. */
static final int NUM = 258;
/** Token number,to be returned by the scanner. */
static final int NEG = 259;
with no space after the comma. Fix that.
* data/skeletons/bison.m4 (b4_token_format): Quote where appropriate.
Unfortunately in the Java skeleton the user cannot override the way
locations are displayed, and locations don't know the structure of the
positions. So they cannot implement the tricks used in the C/C++
skeletons to display "1.1" instead of "1.1-1.2".
* tests/local.at (Java): Add support for column tracking in the
locations, as we did in examples/java/calc.
* tests/calc.at: Use AT_CALC_YYLEX.
Soon calculator tests for Java will move from java.at to calc.at.
Which implies improving the Java testing infrastructure in
local.at (for instance really tracking columns in positions, not just
token number). Detach java.at from local.at.
* tests/java.at (AT_JAVA_POSITION_DEFINE_OLD): New.
Use it.
* examples/java/calc/Calc.y: The StreamTokenizer cannot "peek" for the
next character, it reads it, and keeps it for the next call. So the
current location is one passed the end of the current token. To avoid
this, keep the previous position, and use it to end the current token.
* examples/java/calc/Calc.test: Adjust.
The C, C++ and D skeletons used to show the stack right after popping
the stack during the reduction. Now that the stack is printed after
reaching a new state, that has become useless:
Entering state 1
Stack now 0 1
Reducing stack by rule 5 (line 83):
$1 = token "number" (1)
-> $$ = nterm exp (1)
Stack now 0
Entering state 8
Stack now 0 8
Remove the "Stack now 0" line.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c:
Here.
Currently, if we have long rules and series of shift, we stack states
without showing stack. Let's be more incremental, and do how the Java
skeleton does.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c:
Here.
Adjust test cases.
* tests/torture.at (AT_DATA_STACK_TORTURE): Disable stack traces: this
test produces a very large stack, and showing the stack each time we
shift a token goes quadatric.
The Java skeleton displays
Reading a token:
Next token is token "number" (1)
while the other display
Reading a token: Next token is token "number" (1)
When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in. So let's propagate the Java way, but with the colon.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
* tests/local.at (AT_LANG_MATCH): New.
(AT_YYERROR_DECLARE(java), AT_YYERROR_DECLARE_EXTERN(java)): New.
* tests/calc.at: The grammar file for Java is quite different for the
others, and continuing to assemble it from pieces makes the grammar
file hard to understand. Let's also dispatch on the language to
assemble it, and isolate Java from the others.
Most of this comes from java.at.
This example, so far, was tracking the current token number, not the
current column number. This is not nice for an example...
* examples/java/Calc.y (PositionReader): New.
Use it.
* examples/java/Calc.test: Check the output.
AFAICT, autoboxing/unboxing was added in Java 5 (September 30, 2004).
I think we can afford to use it. It should help us merge some Java
tests with the main ones.
However, beware that != does not unbox: it compares the object
addresses.
* examples/java/Calc.y, tests/java.at: Simplify.
* examples/java/Calc.test, tests/java.at: Improve tests.
When building the test cases, emitting code in the epilogue is very
constraining. Let's make it simpler thanks to %code epilogue.
However, I don't want to document this: it is bad style to use it (we
should avoid having too many ways to write the same thing,
TI!MTOWTDI), just put your code in the true epilogue section.
* data/skeletons/glr.c, data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c: Implement support for %code epilogue.
Remove useless comments.
* tests/calc.at, tests/java.at: Simplify.
Modeled after what was done in yacc.c, yet somewhat different since
yyGLRStack play the role of the full yyparse_context_t. It will
probably be defined as a alias, to make interfaces compatible.
LAC is not supported here. And is somewhat tricky.
* data/skeletons/glr.c (yyexpected_tokens, yysyntax_error_arguments): New.
* data/skeletons/glr.c: Move the handling of error messages after the
definition of types. Especially after yyGLRStack, which we will need
in forthcoming patches.
Simplify: yyGLRStack is defined at this point.
The PDF output is more consistent: some nodes were not pointed to with
their title. The HTML output becomes "see section Foo" instead of
"see Foo", but this should be addressed in the next Texinfo release.
Info output is simplified, as it uses only the node name and not its
title. But it's considered easier to read this way.
See https://lists.gnu.org/r/help-texinfo/2020-01/msg00031.html.
* doc/bison.texi: Set @xrefautomaticsectiontitle on.
Simplify all uses of ref.
Add an example to demonstrate the use of push parser. I'm pleasantly
surprised: parse.error=detailed works like a charm with push parsers.
* examples/c/local.mk, examples/c/pushcalc/Makefile
* examples/c/pushcalc/README.md, examples/c/pushcalc/calc.test,
* examples/c/pushcalc/calc.y, examples/c/pushcalc/local.mk:
New.
* examples/c/calc/calc.y: Restore to its original state, with
parse.error=detailed instead of parse.error=custom (this example
should be simple).
* examples/c/calc/calc.test: Check syntax errors.
* examples/c/lexcalc/parse.y: Add comments.
We used to display the unexpected token first:
$ bison foo.y
foo.y:1.8-13: error: syntax error, unexpected %token, expecting character literal or identifier or <tag>
1 | %token %token
| ^~~~~~
GCC uses a different format:
$ gcc-mp-9 foo.c
foo.c:1:5: error: expected identifier or '(' before ')' token
1 | int()()()
| ^
and so does Clang:
$ clang-mp-9.0 foo.c
foo.c:1:5: error: expected identifier or '('
int()()()
^
1 error generated.
They display the unexpected token last (or not at all). Also, they
don't waste width with "syntax error". Let's try that. It gives, for
the same example as above:
$ bison foo.y
foo.y:1.8-13: error: expected character literal or identifier or <tag> before %token
1 | %token %token
| ^~~~~~
* src/complain.h, src/complain.c (syntax_error): New.
* src/parse-gram.y (yyreport_syntax_error): Use it.