"detailed" error messages are almost like "verbose", except that we
don't double escape them, they don't get inner quotes, we don't use
yytnamerr, and we hide the table.
"custom" is exposed with the "detailed" tokens, not the "verbose"
ones: they are not double-quoted.
Because there's a risk that some people use yytname even without
"verbose", let's keep yytname (instead of yys_name) in "simple"
parse.error.
* src/output.c (prepare_symbol_names): Be ready to output symbol names
unquoted.
(prepare_symbol_names): Output both the old tname table, and the new
symbol_names one.
* data/skeletons/bison.m4: Accept 'detailed'.
* data/skeletons/yacc.c: When parse.error is 'detailed', don't emit
yytname and yytnamerr, just yysymbol_name with the table inside.
* tests/calc.at: Adjust.
Only parse.error verbose and simple will get the original yytname: the
other options will rely on a different table. So let's move on top of
the yysymbol_name function.
* data/skeletons/c.m4 (yy_symbol_print): Use yysymbol_name.
* data/skeletons/glr.c (yytokenName): Rename as...
(yysymbol_name): this.
The change of naming scheme is unfortunate, but it's definitely glr.c
which is "wrong".
Currently yy_symbol_print is defined before yytokenName, although it
should use it instead of read yytname directly. Move blocks around to
avoid this.
* data/skeletons/glr.c (yy_symbol_print): Move its definition after
that of yytokenName.
Currently we get warnings with GCC 4.8 when running the
maintainer-check-g++ tests:
143. skeletons.at:85: testing Installed skeleton file names ...
../../tests/skeletons.at:120: COLUMNS=1000; export COLUMNS; bison --color=no -fno-caret --skeleton=yacc.c -o input-cmd-line.c input-cmd-line.y
../../tests/skeletons.at:121: $CC $CFLAGS $CPPFLAGS $LDFLAGS -o input-cmd-line input-cmd-line.c $LIBS
stderr:
input-cmd-line.c: In function 'int yysyntax_error(long int*, char**, const yyparse_context_t*)':
input-cmd-line.c:977:52: error: conversion to 'int' from 'long int' may alter its value [-Werror=conversion]
YYSIZEOF (yyarg) / YYSIZEOF (*yyarg));
^
cc1plus: all warnings being treated as errors
stdout:
../../tests/skeletons.at:121: exit code was 1, expected 0
and
429. calc.at:823: testing Calculator parse.error=custom %locations api.prefix={calc} ...
../../tests/calc.at:823: COLUMNS=1000; export COLUMNS; bison --color=no -fno-caret -Wno-deprecated -o calc.c calc.y
../../tests/calc.at:823: $CC $CFLAGS $CPPFLAGS $LDFLAGS -o calc calc.c $LIBS
stderr:
calc.y: In function 'int yyreport_syntax_error(const yyparse_context_t*)':
calc.y:157:58: error: conversion to 'int' from 'long unsigned int' may alter its value [-Werror=conversion]
int n = yysyntax_error_arguments (ctx, arg, sizeof arg / sizeof *arg);
^
cc1plus: all warnings being treated as errors
stdout:
../../tests/calc.at:823: exit code was 1, expected 0
We could use a cast to avoid the warning, but it becomes too
cluttered. We can also use YYPTRDIFF_T, but that forces the user to
use YYPTRDIFF_T too, although this is an array of tokens, which is
limited by YYNTOKENS, an int. So let's completely avoid this warning.
* data/skeletons/yacc.c, tests/local.at (yyreport_syntax_error): Avoid
relying on sizeof to compute the array capacity.
Enhance the calculator tests: show that passing arguments to yyerror
works.
* tests/calc.at: Add a new parse-param, nerrs, which counts the number
of syntax errors in a run.
* tests/local.at: Adjust to handle the new 'nerrs' argument, when
present.
The custom error reporting function show sees the user's additional
arguments. Let's experiment with passing them as arguments to
yyreport_syntax_error, but maybe storing them in the context would be
a bettter alternative.
* data/skeletons/yacc.c (yyreport_syntax_error): Handle the
parse-params.
* tests/calc.at, tests/local.at: Adjust.
Provide users with a means to query for the currently allowed tokens.
Could be used for autocompletion for instance.
* data/skeletons/yacc.c (yyexpected_tokens): New, extracted from
yysyntax_error_arguments.
* examples/c/calc/calc.y (PRINT_EXPECTED_TOKENS): New.
Use it.
When parse.error is custom, let users define a yyreport_syntax_error
function, and use it.
* data/skeletons/bison.m4 (b4_error_verbose_if): Accept 'custom'.
* data/skeletons/yacc.c: Implement it.
* examples/c/calc/calc.y: Experiment with it.
That allows users to cover more cases, such as easily filtering some
arguments they don't want to expose. But they now have to call
yysymbol_name explicitly.
* data/skeletons/yacc.c (yysyntax_error_arguments, yysyntax_error):
Deal with symbol numbers instead of symbol names.
Isolate a function that returns the list of expected and unexpected
tokens. It will be exposed to users willing to customize their error
messages.
* data/skeletons/yacc.c (yyparse_context_t): New.
(yyerror_message_arguments): New, extracted from yysyntax_error.
Let's have C be the reference, and match it elsewhere. Maybe C is too
verbose and some adjustments are needed, but then that would be done
in another batch of patches.
* data/skeletons/lalr1.cc: Print the stack once we popped after
YYERROR, and before emptying the stack at the end of parsing.
Currently the C and C++ parse traces differ in the order in which the
stack is displayed: bottom up in C, top down in C++. Let's stick to
the C order.
* data/skeletons/stack.hh (stack::iterator, stack::const_iterator)
(begin, end): Be forward, not backward.
Provide the users with a public API to get the name of the tokens. A
thin wrapper around yytname.
* data/skeletons/yacc.c (yysymbol_name): New.
Use it.
Supporting YYERROR_VERBOSE via cpp is a nuisance: m4 is in charge of
handling alternatives. When adding more options for %define
parse.error, supporting both CPP and M4 is too complex. Anyway,
YYERROR_VERBOSE was deprecated long ago.
* data/skeletons/yacc.c: Use m4 only to handle verbose/simple error
messages.
I would like to offer new ways to build the error message. As a first
step, let's simplify yysyntax_error whose first loop does two things
at the same time: (i) collect the tokens to be reported in the error
message, and (ii) accumulate their sizes and possibly return
"overflow". Let's pull (ii) in a second step.
Then test 525 (regression.at:1193: parse.error=verbose overflow)
failed. This test checks that we correctly report "memory overflow"
when the error message is too large. However the test is mistaken: it
is triggered in a place where there are five (large) expected tokens,
so anyway we would not display them, so there is no (memory) overflow
here! Transform this test to (i) check that indeed there is no
overflow, and (ii) create syntax_error3 which does check the intended
behavior, but with four expected tokens.
* data/skeletons/yacc.c (yysyntax_error): First compute the list of
arguments, then compute yysize.
* tests/regression.at (parse.error=verbose overflow): Enhance and fix.
Problem reported by Andy Fiddaman in:
https://lists.gnu.org/r/bug-bison/2019-12/msg00021.html
* data/skeletons/yacc.c (yy_reduce_print, yy_lac, yysyntax_error)
(yyreturn): If I might be a char, write a[+I] instead of a[I],
so that ‘gcc -Wchar-subscripts’ does not complain.
Another breakage revealed by vcsn.
* data/skeletons/c++.m4 (yytranslate_): Do not hard code "yy" and
"parser", both can be changed by the user.
Actually, since we are in the parser itself, there's really no need to
qualify the type.
* data/skeletons/glr.c (YYASSERT): Rename as...
(YY_ASSERT): this, for consistency with yacc.c, and also to emphasize
the fact that this is not for the end user (YY_ prefix).
* tests/glr-regression.at: Define parse.assert.
Now that we use small integral types, possibly unsigned (e.g.,
unsigned char), to store state numbers, using -1 to denote an empty
state (i.e., a state that stores no semantical value) is very
dangerous: it will be confused with state 255, which might be
non-empty.
Rather than allocating a larger range of state numbers to keep the
empty-state apart, let's use the number of a state known to store no
value. The initial state, numbered 0, seems to fit perfectly the job.
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html
* data/skeletons/lalr1.cc (empty_state): Be 0.
It is not used. And its implementation was wrong when api.token.raw
was defined, as it was still mapping to the external token numbers,
instead of the internal ones. Besides it was provided only when
api.token.constructor is defined, yet always declared.
* data/skeletons/c++.m4 (by_type::token): Remove, useless.
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html
The cast is needed when yytranslate_'s argument type is token_type,
i.e., when api.token.constructor is defined.
373. types.at:138: testing lalr1.cc api.value.type=variant api.token.constructor ...
======== Testing with C++ standard flags: ''
../../tests/types.at:138: bison --color=no -fno-caret -o test.cc test.y
../../tests/types.at:138: $CXX $CXXFLAGS $CPPFLAGS $LDFLAGS -o test test.cc $LIBS
stderr:
test.cc:966:16: error: result of comparison of constant 257 with
expression of type 'yy::parser::token_type'
(aka 'yy::parser::token::yytokentype') is always true
[-Werror,-Wtautological-constant-out-of-range-compare]
else if (t <= user_token_number_max_)
~ ^ ~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
It is because it is expected that when api.token.constructor is
defined, only symbol constructors will be used, that yytranslate_ then
takes a token_type. But it is wrong: we still allow literal
characters in this case, as demonstrated by test 373 for instance.
%define api.value.type variant
%define api.token.constructor
%token <std::pair<int, int>> '1' '2';
[...]
static yy::parser::symbol_type yylex ()
{
static char const input[] = "12";
int res = input[toknum++];
typedef yy::parser::symbol_type symbol;
if (res)
return symbol (res, std::make_pair (res - '0', res - '0' + 1));
else
return symbol (res);
}
So let yytranslate_ always take an int, which makes the cast truly
useless.
* data/skeletons/c++.m4, data/skeletons/lalr1.cc (yytranslate_): here.
The C++ implementation of LAC did not skip the $undefined token,
probably because it was not exposed. Expose it, and use clearer
names.
* data/skeletons/c++.m4: Don't define undef_token_ in yytranslate_,
but...
* data/skeletons/lalr1.cc (yy_undef_token_): here.
Use a more precise type to define yy_undef_token_ and yy_error_token_.
Unfortunately we move from a compile-time value defined via an enum to
a static const member. Eventually we should make it constexpr.
Make LAC implementation more alike yacc.c's one.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: Don't expose
yyuser_token_number_max_ and yyundef_token_. Do as in C++: scope them
into yytranslate_, and only when api.token.raw is not defined.
(yyterror_): Rename as...
(yy_error_token_): this.
* data/skeletons/lalr1.d (token_number_type): New.
Use it.
Can't be done in the Java backend, as Java does not have type aliases.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java (yytoken_number_):
Remove, useless.
Was used in ancient C skeletons to support YYPRINT, long obsoleted by
%printer.
It is not used at all. We will remove it also from yacc.c, but
later (see TODO).
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java (yyerrcode_):
Remove.