Teaches bison about a new command line option, --file-prefix-map OLD=NEW
(based on the -ffile-prefix-map option from GCC) which causes it to
replace and file path of OLD in the text of the output file with NEW,
mainly for header guards and comments. The primary use of this is to
make builds reproducible with different input paths, and in particular
the debugging information produced when the source code is compiled. For
example, a distro may know that the bison source code will be located at
"/usr/src/bison" and thus can generate bison files that are reproducible
with the following command:
bison --output=/build/bison/parse.c -d --file-prefix-map=/build/bison/=/usr/src/bison/ parse.y
Importantly, this will change the header guards and #line directives
from:
#ifndef YY_BUILD_BISON_PARSE_H
#line 100 "/build/bison/parse.h"
to
#ifndef YY_USR_SRC_BISON_PARSE_H
#line 100 "/usr/src/bison/parse.h"
which is reproducible.
See https://lists.gnu.org/r/bison-patches/2020-05/msg00016.html
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
* src/files.h, src/files.c (spec_mapped_header_file)
(mapped_dir_prefix, map_file_name, add_prefix_map): New.
* src/getargs.c (-M, --file-prefix-map): New option.
* src/output.c (prepare): Define b4_mapped_dir_prefix and
b4_spec_header_file.
* src/scan-skel.l (@ofile@): Output the mapped file name.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/location.cc,
* data/skeletons/yacc.c:
Adjust.
* doc/bison.texi: Document.
* tests/input.at, tests/output.at: Check.
The implementation of yy::parser::symbol_name is emitted even before
the implementation of yy::parser::parser. This makes little sense.
* data/skeletons/lalr1.cc (symbol_name): Move its implementation in
the same place as in the class definition: after "error" and before
"context".
Reported by Martin Blais <blais@furius.ca>.
https://lists.gnu.org/r/help-bison/2020-05/msg00005.html
* data/skeletons/lalr1.cc (symbol_name): Make it public.
Add a private hidden hook to enable testing of private parts.
* tests/local.at (AT_DATA_GRAMMAR_PROLOGUE): Help Emacs find the right
language mode.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check that we
can read symbol_name.
The user gives yyexpected_tokens a limit: the max number of tokens she
wants to hear about. That's because an error message that reports a
bazillion of possible tokens is useless.
In that case yyexpected_tokens returned 0, so the user would not know
if there are too many expected tokens or none (yes, that's possible).
There are several ways to tell the user in which situation she's in:
- return some E2MANY, a negative value. Then it makes the pattern
int argsize = yypcontext_expected_tokens (ctx, arg, ARGS_MAX);
if (argsize < 0)
return argsize;
no longer valid, as for E2MANY (i) the user must generate the error
message anyway, and (ii) she should not return E2MANY
- return ARGS_MAX + 1. Then it makes it dangerous for the user, as
she has to iterate update `min (ARGS_MAX, argsize)`.
Returning 0 is definitely simpler and safer for the user, as it tells
her "this is not an error, just generate your message without a list
of expecting tokens". So let's still return 0, but set arg[0] to the
empty token when the list is really empty.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.java
* data/skeletons/yacc.c (yyexpected_tokens): Put the empty symbol
first if there are no possible tokens at all.
* examples/c/bistromathic/parse.y: Demonstrate how to use that.
These are internal details. `type_get ()` is still there to ensure
backward compatibility, `kind ()` being the modern way.
* data/skeletons/c++.m4 (by_type, by_type::type): Rename as...
(by_kind, by_kind::kind_): this.
Adjust dependencies.
* data/skeletons/yacc.c (yyparse): When the scanner returns YYERRCODE,
go directly to error recovery (yyerrlab1).
However, don't keep the error token as lookahead, that token is too
special.
* data/skeletons/lalr1.cc: Likewise.
* examples/c/bistromathic/parse.y (yylex): Use that feature to report
nicely invalid characters.
* examples/c/bistromathic/bistromathic.test: Check that.
* examples/test: Neutralize gratuitous differences such as rule
position.
* tests/calc.at: Check that case in C only.
The other case seem to be working, but that's an illusion that the
next commit will address (in fact, they can enter endless loops, and
report the error several times anyway).
We will not keep YYERRCODE anyway, it causes backward compatibility
issues. So as a first step, let all the skeletons use that name,
until we have a better one.
* data/skeletons/bison.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c, doc/bison.texi, tests/headers.at,
* tests/input.at:
here.
In C/C++, N_ is a no-op. Define it if the user didn't.
Suggested by Frank Heckenbach.
https://lists.gnu.org/r/bug-bison/2020-04/msg00010.html
* src/output.c (prepare_symbol_names): Rename has_translations as
has_translations_flag.
* data/skeletons/bison.m4 (b4_has_translations_if): New.
* data/skeletons/java.m4 (b4_trans): Use it.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c
(N_): Provide a default definition.
symbol_type::token () was removed: it returned the token kind of a
symbol. To do that, one needs to convert from the symbol kind to the
token kind, which requires a table.
This broke some users' unit tests for scanners, see
https://lists.gnu.org/r/bug-bison/2020-01/msg00001.htmlhttps://lists.gnu.org/r/bug-bison/2020-03/msg00020.htmlhttps://lists.gnu.org/r/help-bison/2020-04/msg00005.html
Instead of making this possible again, let's check the symbol's kind
instead. So give proper access to a symbol's kind.
That feature existed, undocumented, as 'type_get()'. Let's rename
this as 'kind()'.
* data/skeletons/c++.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc (type_get): Rename as...
(kind): This.
(type_get): Install a backward compatibility alias.
* doc/bison.texi (Complete Symbols): Document symbol_type and
symbol_type::kind.
Not all the symbols have a fixed symbol code. UNDEF's one is fixed:
-2.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c: here.
* doc/bison.texi (C++ Parser Context): New.
* data/skeletons/lalr1.cc (parser::yysymbol_name): Rename as...
(parser::symbol_name): this.
(A Complete C++ Example): Promote LAC, now that we have it.
Promote parse.error detailed over verbose.
* examples/c++/calc++/calc++.test, tests/local.at: Adjust.
yy::parser features a parse() function, not a yyparse() one.
* data/skeletons/lalr1.cc (yyreport_syntax_error)
(context::yyexpected_tokens): Rename as...
(report_syntax_error, context::expected_tokens): these.
* data/skeletons/bison.m4, data/skeletons/c++.m4, data/skeletons/c.m4,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java:
Refer to the "kind" of a symbol, not its "type", where appropriate.
Because of the insane current implementation of glr.cc, things are a
bit nasty. We will rename symbol_number_type as symbol_type_type
later, to keep this commit small.
* data/skeletons/c++.m4 (b4_declare_symbol_enum): New.
Also define YYNTOKENS to avoid type clashes when yyntokens_ was
actually defined in another enum.
Use it.
(symbol_number_type): Be an alias of symbol_type_type.
Use YYSYMBOL_YYEMPTY and the like.
Use symbol_number_type where appropriate.
(empty_symbol): Remove.
(yytranslate_): Use symbol_number_type, not token_number_type.
* data/skeletons/lalr1.cc: Use symbol_number_type where appropriate.
Adjust to the replacement of empty_symbol by YYSYMBOL_YYEMPTY.
(yy_error_token_, yy_undef_token_, yyeof_, yyntokens_): Remove.
Adjust dependencies.
* data/skeletons/glr.cc: Use symbol_number_type where appropriate.
Forward definitions of YYSYMBOL_YYEMPTY, etc. to glr.c.
* tests/headers.at: Accept YYNTOKENS and other YYSYMBOL_*.
* tests/local.at (AT_YYERROR_DEFINE(c++)): Use symbol_number_type.
We could just "inline yysyntax_error_arguments back" in the routines
it was originally extracted from, but I think the code is nicer to
read this way.
* data/skeletons/glr.c (yysyntax_error_arguments): Generate only for
detailed and verbose error messages.
* data/skeletons/yacc.c: Likewise.
* data/skeletons/lalr1.cc (parser::context::yysyntax_error_arguments):
Move as...
(parser::yysyntax_error_arguments_): this.
And only for detailed and verbose error messages.
The current implementation of parser::context keeps a copy of the
lookahead. This is troublesome since we support move-only types.
Besides, while GCC is happy with the current implementation, Clang
complains that the ctor it needs to build the copy of the lookahead is
not yet available.
461. calc.at:1120: testing Calculator C++ %defines %locations parse.error=verbose %name-prefix "calc" %verbose ...
calc.at:1120: COLUMNS=1000; export COLUMNS; bison --color=no -fno-caret -Wno-deprecated -o calc.cc calc.y
calc.at:1120: $CXX $CXXFLAGS $CPPFLAGS $LDFLAGS -o calc calc.cc calc-lex.cc calc-main.cc $LIBS
stderr:
In file included from calc-lex.cc:7:
calc.hh:351:12: error: instantiation of function 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' required here, but no definition is available [-Werror,-Wundefined-func-template]
struct symbol_type : basic_symbol<by_type>
^
calc.hh:273:7: note: forward declaration of template entity is here
basic_symbol (const basic_symbol& that);
^
calc.hh:351:12: note: add an explicit instantiation declaration to suppress this warning if 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' is explicitly instantiated in another translation unit
struct symbol_type : basic_symbol<by_type>
^
1 error generated.
In file included from calc-main.cc:7:
calc.hh:351:12: error: instantiation of function 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' required here, but no definition is available [-Werror,-Wundefined-func-template]
struct symbol_type : basic_symbol<by_type>
^
calc.hh:273:7: note: forward declaration of template entity is here
basic_symbol (const basic_symbol& that);
^
calc.hh:351:12: note: add an explicit instantiation declaration to suppress this warning if 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' is explicitly instantiated in another translation unit
struct symbol_type : basic_symbol<by_type>
^
1 error generated.
stdout:
calc.at:1120: exit code was 1, expected 0
461. calc.at:1120: 461. Calculator C++ %defines %locations parse.error=verbose %name-prefix "calc" %verbose (calc.at:1120): FAILED (calc.at:1120)
* data/skeletons/lalr1.cc (context::yyla_): Make it a const-ref.
Move the implementation out of the declaration.
Address compiler warnings such as
warning: declaration of 'yyla' shadows a member of 'yy::parser::context' [-Wshadow]
* data/skeletons/lalr1.cc (context): Don't use the same names for
variables and members.
Use foo_ for private members, as in parser.
Also, use the + trick in array accesses to please ICC and provide it
with an int.
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added test cases
* tests/local.at: added yyreport_syntax_error implementation
for C++ test cases
parse.error has more than two possible values.
* data/skeletons/bison.m4 (b4_error_verbose_if, b4_error_verbose_flag):
Remove.
(b4_parse_error_case, b4_parse_error_bmatch): New.
Adjust dependencies.
The C, C++ and D skeletons used to show the stack right after popping
the stack during the reduction. Now that the stack is printed after
reaching a new state, that has become useless:
Entering state 1
Stack now 0 1
Reducing stack by rule 5 (line 83):
$1 = token "number" (1)
-> $$ = nterm exp (1)
Stack now 0
Entering state 8
Stack now 0 8
Remove the "Stack now 0" line.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c:
Here.
Currently, if we have long rules and series of shift, we stack states
without showing stack. Let's be more incremental, and do how the Java
skeleton does.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c:
Here.
Adjust test cases.
* tests/torture.at (AT_DATA_STACK_TORTURE): Disable stack traces: this
test produces a very large stack, and showing the stack each time we
shift a token goes quadatric.
The Java skeleton displays
Reading a token:
Next token is token "number" (1)
while the other display
Reading a token: Next token is token "number" (1)
When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in. So let's propagate the Java way, but with the colon.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.
* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
Let's have C be the reference, and match it elsewhere. Maybe C is too
verbose and some adjustments are needed, but then that would be done
in another batch of patches.
* data/skeletons/lalr1.cc: Print the stack once we popped after
YYERROR, and before emptying the stack at the end of parsing.
Currently the C and C++ parse traces differ in the order in which the
stack is displayed: bottom up in C, top down in C++. Let's stick to
the C order.
* data/skeletons/stack.hh (stack::iterator, stack::const_iterator)
(begin, end): Be forward, not backward.