In C/C++, N_ is a no-op. Define it if the user didn't.
Suggested by Frank Heckenbach.
https://lists.gnu.org/r/bug-bison/2020-04/msg00010.html
* src/output.c (prepare_symbol_names): Rename has_translations as
has_translations_flag.
* data/skeletons/bison.m4 (b4_has_translations_if): New.
* data/skeletons/java.m4 (b4_trans): Use it.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c
(N_): Provide a default definition.
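A minimal sketch of such a default definition, assuming the usual gettext convention that N_ only marks a string for later translation. The table and the helper example_name are hypothetical, for illustration only.

```cpp
#include <cassert>
#include <cstring>

// Provide a no-op N_ only when the user did not define one already.
#ifndef N_
# define N_(Msgid) Msgid
#endif

// Hypothetical table of symbol names marked for translation.
static const char* const example_names[] = { N_("end of file"), N_("number") };

// Hypothetical accessor: with the no-op N_, the string is returned unchanged.
inline const char* example_name(int i)
{
  assert(0 <= i && i < 2);
  return example_names[i];
}
```

A user who defines N_ before this point keeps their own definition, since the guard skips the default.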
symbol_type::token () was removed: it returned the token kind of a
symbol. To do that, one needs to convert from the symbol kind to the
token kind, which requires a table.
This broke some users' unit tests for scanners, see
https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html
https://lists.gnu.org/r/bug-bison/2020-03/msg00020.html
https://lists.gnu.org/r/help-bison/2020-04/msg00005.html
Rather than making this possible again, let's check the symbol's kind
instead. So give proper access to a symbol's kind.
That feature existed, undocumented, as 'type_get()'. Let's rename
this as 'kind()'.
* data/skeletons/c++.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc (type_get): Rename as...
(kind): This.
(type_get): Install a backward compatibility alias.
* doc/bison.texi (Complete Symbols): Document symbol_type and
symbol_type::kind.
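A minimal sketch of the renaming with its compatibility alias, and of how a scanner unit test might check the kind. The namespace sketch, the struct symbol and the enumerator values are hypothetical stand-ins, not the generated code.

```cpp
#include <cassert>

namespace sketch
{
  // Hypothetical symbol kinds; real values come from the generated parser.
  enum symbol_kind { S_YYEOF = 0, S_NUMBER = 3 };

  struct symbol
  {
    explicit symbol(symbol_kind k) : kind_(k) {}

    // The documented accessor.
    symbol_kind kind() const { return kind_; }

    // Backward compatibility alias for the old, undocumented name.
    symbol_kind type_get() const { return kind(); }

  private:
    symbol_kind kind_;
  };
}
```

A scanner test can then compare s.kind() against the expected symbol kind, with no table needed to convert back to a token kind.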
Not all the symbols have a fixed symbol code. UNDEF's one is fixed:
-2.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c: here.
* doc/bison.texi (C++ Parser Context): New.
* data/skeletons/lalr1.cc (parser::yysymbol_name): Rename as...
(parser::symbol_name): this.
(A Complete C++ Example): Promote LAC, now that we have it.
Promote parse.error detailed over verbose.
* examples/c++/calc++/calc++.test, tests/local.at: Adjust.
yy::parser features a parse() function, not a yyparse() one.
* data/skeletons/lalr1.cc (yyreport_syntax_error)
(context::yyexpected_tokens): Rename as...
(report_syntax_error, context::expected_tokens): these.
* data/skeletons/bison.m4, data/skeletons/c++.m4, data/skeletons/c.m4,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java:
Refer to the "kind" of a symbol, not its "type", where appropriate.
Because of the insane current implementation of glr.cc, things are a
bit nasty. We will rename symbol_number_type as symbol_type_type
later, to keep this commit small.
* data/skeletons/c++.m4 (b4_declare_symbol_enum): New.
Also define YYNTOKENS to avoid type clashes when yyntokens_ was
actually defined in another enum.
Use it.
(symbol_number_type): Be an alias of symbol_type_type.
Use YYSYMBOL_YYEMPTY and the like.
Use symbol_number_type where appropriate.
(empty_symbol): Remove.
(yytranslate_): Use symbol_number_type, not token_number_type.
* data/skeletons/lalr1.cc: Use symbol_number_type where appropriate.
Adjust to the replacement of empty_symbol by YYSYMBOL_YYEMPTY.
(yy_error_token_, yy_undef_token_, yyeof_, yyntokens_): Remove.
Adjust dependencies.
* data/skeletons/glr.cc: Use symbol_number_type where appropriate.
Forward definitions of YYSYMBOL_YYEMPTY, etc. to glr.c.
* tests/headers.at: Accept YYNTOKENS and other YYSYMBOL_*.
* tests/local.at (AT_YYERROR_DEFINE(c++)): Use symbol_number_type.
We could just "inline yysyntax_error_arguments back" in the routines
it was originally extracted from, but I think the code is nicer to
read this way.
* data/skeletons/glr.c (yysyntax_error_arguments): Generate only for
detailed and verbose error messages.
* data/skeletons/yacc.c: Likewise.
* data/skeletons/lalr1.cc (parser::context::yysyntax_error_arguments):
Move as...
(parser::yysyntax_error_arguments_): this.
And only for detailed and verbose error messages.
The current implementation of parser::context keeps a copy of the
lookahead. This is troublesome since we support move-only types.
Besides, while GCC is happy with the current implementation, Clang
complains that the ctor it needs to build the copy of the lookahead is
not yet available.
461. calc.at:1120: testing Calculator C++ %defines %locations parse.error=verbose %name-prefix "calc" %verbose ...
calc.at:1120: COLUMNS=1000; export COLUMNS; bison --color=no -fno-caret -Wno-deprecated -o calc.cc calc.y
calc.at:1120: $CXX $CXXFLAGS $CPPFLAGS $LDFLAGS -o calc calc.cc calc-lex.cc calc-main.cc $LIBS
stderr:
In file included from calc-lex.cc:7:
calc.hh:351:12: error: instantiation of function 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' required here, but no definition is available [-Werror,-Wundefined-func-template]
struct symbol_type : basic_symbol<by_type>
^
calc.hh:273:7: note: forward declaration of template entity is here
basic_symbol (const basic_symbol& that);
^
calc.hh:351:12: note: add an explicit instantiation declaration to suppress this warning if 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' is explicitly instantiated in another translation unit
struct symbol_type : basic_symbol<by_type>
^
1 error generated.
In file included from calc-main.cc:7:
calc.hh:351:12: error: instantiation of function 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' required here, but no definition is available [-Werror,-Wundefined-func-template]
struct symbol_type : basic_symbol<by_type>
^
calc.hh:273:7: note: forward declaration of template entity is here
basic_symbol (const basic_symbol& that);
^
calc.hh:351:12: note: add an explicit instantiation declaration to suppress this warning if 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' is explicitly instantiated in another translation unit
struct symbol_type : basic_symbol<by_type>
^
1 error generated.
stdout:
calc.at:1120: exit code was 1, expected 0
461. calc.at:1120: 461. Calculator C++ %defines %locations parse.error=verbose %name-prefix "calc" %verbose (calc.at:1120): FAILED (calc.at:1120)
* data/skeletons/lalr1.cc (context::yyla_): Make it a const-ref.
Move the implementation out of the declaration.
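A minimal sketch of why a const reference helps with move-only types, assuming a simplified context class. The names in namespace sketch are hypothetical and only mirror the idea.

```cpp
#include <cassert>
#include <memory>

namespace sketch
{
  // Move-only symbol: the unique_ptr member deletes the copy constructor.
  struct symbol
  {
    explicit symbol(int v) : value(new int(v)) {}
    std::unique_ptr<int> value;
  };

  class context
  {
  public:
    // Binds to the caller's lookahead; no copy constructor is required.
    explicit context(const symbol& la) : la_(la) {}
    int lookahead_value() const { return *la_.value; }

  private:
    const symbol& la_;
  };

  // Hypothetical helper: inspect the lookahead through a temporary context.
  inline int peek(const symbol& la)
  {
    return context(la).lookahead_value();
  }
}
```

The context must not outlive the lookahead it refers to, which holds here because it only exists during the call.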
Address compiler warnings such as
warning: declaration of 'yyla' shadows a member of 'yy::parser::context' [-Wshadow]
* data/skeletons/lalr1.cc (context): Don't use the same names for
variables and members.
Use foo_ for private members, as in parser.
Also, use the + trick in array accesses to please ICC and provide it
with an int.
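A minimal sketch of the + trick, assuming a narrow index type such as signed char. Unary plus applies the integral promotions, so the subscript is an int. The table and the function lookup are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical table indexed by a narrow integer.
static const int table[] = { 10, 20, 30 };

// Unary + promotes the signed char to int before it is used as a subscript.
inline int lookup(signed char i)
{
  assert(0 <= i && i < 3);
  return table[+i];
}

// The promoted expression really has type int.
static_assert(std::is_same<decltype(+static_cast<signed char>(0)), int>::value,
              "unary + promotes signed char to int");
```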
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added test cases
* tests/local.at: added yyreport_syntax_error implementation
for C++ test cases
parse.error has more than two possible values.
* data/skeletons/bison.m4 (b4_error_verbose_if, b4_error_verbose_flag):
Remove.
(b4_parse_error_case, b4_parse_error_bmatch): New.
Adjust dependencies.
The C, C++ and D skeletons used to show the stack right after popping
the stack during the reduction. Now that the stack is printed after
reaching a new state, that has become useless:
Entering state 1
Stack now 0 1
Reducing stack by rule 5 (line 83):
$1 = token "number" (1)
-> $$ = nterm exp (1)
Stack now 0
Entering state 8
Stack now 0 8
Remove the "Stack now 0" line.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c:
Here.
Currently, with long rules and series of shifts, we stack states
without showing the stack. Let's be more incremental, and do as the
Java skeleton does.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c:
Here.
Adjust test cases.
* tests/torture.at (AT_DATA_STACK_TORTURE): Disable stack traces: this
test produces a very large stack, and showing the stack each time we
shift a token goes quadratic.
The Java skeleton displays
Reading a token:
Next token is token "number" (1)
while the others display
Reading a token: Next token is token "number" (1)
When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in. So let's propagate the Java way, but with the colon.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
Just like the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.
* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
Let's have C be the reference, and match it elsewhere. Maybe C is too
verbose and some adjustments are needed, but then that would be done
in another batch of patches.
* data/skeletons/lalr1.cc: Print the stack once we popped after
YYERROR, and before emptying the stack at the end of parsing.
Currently the C and C++ parse traces differ in the order in which the
stack is displayed: bottom up in C, top down in C++. Let's stick to
the C order.
* data/skeletons/stack.hh (stack::iterator, stack::const_iterator)
(begin, end): Be forward, not backward.
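A minimal sketch of the bottom-up order, assuming the state stack is stored bottom first in a std::vector. The function stack_trace is hypothetical. It only mimics the "Stack now" trace line.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Print states from the bottom of the stack to the top, as the C skeleton does.
inline std::string stack_trace(const std::vector<int>& states)
{
  std::ostringstream out;
  out << "Stack now";
  for (std::vector<int>::const_iterator i = states.begin();
       i != states.end(); ++i)
    out << ' ' << *i;
  return out.str();
}
```

With forward iterators, a plain begin-to-end loop yields the C order without any reversal in the printing code.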
Now that we use small integral types, possibly unsigned (e.g.,
unsigned char), to store state numbers, using -1 to denote an empty
state (i.e., a state that stores no semantic value) is very
dangerous: it will be confused with state 255, which might be
non-empty.
Rather than allocating a larger range of state numbers to keep the
empty-state apart, let's use the number of a state known to store no
value. The initial state, numbered 0, seems to fit the job perfectly.
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html
* data/skeletons/lalr1.cc (empty_state): Be 0.
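A minimal sketch of the confusion, assuming state numbers stored in an 8-bit unsigned char. The type state_t and both predicates are hypothetical.

```cpp
#include <cassert>

// Assumed narrow state type (8-bit unsigned char).
typedef unsigned char state_t;

// Old convention: -1 denotes "empty", but it converts to 255.
inline bool looks_empty_old(state_t s)
{
  return s == static_cast<state_t>(-1);
}

// New convention: the initial state 0, which stores no value, denotes "empty".
inline bool looks_empty_new(state_t s)
{
  return s == 0;
}
```

With the old convention a genuine state 255 is reported as empty; with the new one it is not.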
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html
The cast is needed when yytranslate_'s argument type is token_type,
i.e., when api.token.constructor is defined.
373. types.at:138: testing lalr1.cc api.value.type=variant api.token.constructor ...
======== Testing with C++ standard flags: ''
../../tests/types.at:138: bison --color=no -fno-caret -o test.cc test.y
../../tests/types.at:138: $CXX $CXXFLAGS $CPPFLAGS $LDFLAGS -o test test.cc $LIBS
stderr:
test.cc:966:16: error: result of comparison of constant 257 with
expression of type 'yy::parser::token_type'
(aka 'yy::parser::token::yytokentype') is always true
[-Werror,-Wtautological-constant-out-of-range-compare]
else if (t <= user_token_number_max_)
~ ^ ~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
yytranslate_ takes a token_type in that case because it was expected
that, when api.token.constructor is defined, only symbol constructors
would be used. But that is wrong: we still allow literal
characters in this case, as demonstrated by test 373 for instance.
%define api.value.type variant
%define api.token.constructor
%token <std::pair<int, int>> '1' '2';
[...]
static yy::parser::symbol_type yylex ()
{
static char const input[] = "12";
int res = input[toknum++];
typedef yy::parser::symbol_type symbol;
if (res)
return symbol (res, std::make_pair (res - '0', res - '0' + 1));
else
return symbol (res);
}
So let yytranslate_ always take an int, which makes the cast truly
useless.
* data/skeletons/c++.m4, data/skeletons/lalr1.cc (yytranslate_): here.
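A minimal sketch of a translation function taking an int, so that literal characters such as '1' are accepted and the range check is meaningful. All names and values in namespace sketch are hypothetical, and the switch stands in for the generated translation table.

```cpp
#include <cassert>

namespace sketch
{
  // Hypothetical symbol kinds and upper bound for user token numbers.
  enum { eof_kind = 0, undef_kind = 2, one_kind = 3, two_kind = 4 };
  const int user_token_number_max = 257;

  // Takes an int: both token numbers and character literals fit.
  inline int translate(int t)
  {
    if (t <= 0)
      return eof_kind;
    if (t > user_token_number_max)
      return undef_kind;
    switch (t)
      {
      case '1': return one_kind;
      case '2': return two_kind;
      default:  return undef_kind;
      }
  }
}
```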
The C++ implementation of LAC did not skip the $undefined token,
probably because it was not exposed. Expose it, and use clearer
names.
* data/skeletons/c++.m4: Don't define undef_token_ in yytranslate_,
but...
* data/skeletons/lalr1.cc (yy_undef_token_): here.
Use a more precise type to define yy_undef_token_ and yy_error_token_.
Unfortunately we move from a compile-time value defined via an enum to
a static const member. Eventually we should make it constexpr.
Make the LAC implementation more like yacc.c's.
It is not used at all. We will remove it also from yacc.c, but
later (see TODO).
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java (yyerrcode_):
Remove.
We still have a few old C casts in lalr1.cc, let's get rid of them.
Reported by Frank Heckenbach.
Actually, let's monitor all our casts using easy-to-grep macros.
Let's use these macros to use the C++ standard casts when we are in
C++.
* data/skeletons/c.m4 (b4_cast_define): New.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/stack.hh,
* data/skeletons/yacc.c:
Use it and/or its casts.
* tests/actions.at, tests/cxx-type.at,
* tests/glr-regression.at, tests/headers.at, tests/torture.at,
* tests/types.at:
Use YY_CAST instead of C casts.
* configure.ac (warn_cxx): Add -Wold-style-cast.
* doc/bison.texi: Disable it.
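A minimal sketch of such a cast macro, assuming it expands to static_cast in C++ and to a C cast otherwise. The helper to_int is hypothetical.

```cpp
#include <cassert>

// Easy-to-grep cast macro: a standard cast in C++, a C cast in C.
#ifndef YY_CAST
# ifdef __cplusplus
#  define YY_CAST(Type, Val) static_cast<Type> (Val)
# else
#  define YY_CAST(Type, Val) ((Type) (Val))
# endif
#endif

// Hypothetical use: an explicit, searchable conversion that
// -Wold-style-cast does not complain about in C++.
inline int to_int(double d)
{
  return YY_CAST(int, d);
}
```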
This skeleton uses a single stack of state structures, so it is less
likely to benefit from a stack size reduction than yacc.c (which uses
several stacks: state number, value and location). But it will reduce
the size of the LAC stack.
This skeleton was already using int for state numbers, so, contrary to
yacc.c, this brings nothing for large automata.
Overall, it is still nicer to make the skeletons alike.
* data/skeletons/lalr1.cc (state_type): Here.
On the CI with GCC 6:
examples/c++/calc++/parser.cc:845:5: error: 'ptrdiff_t' was not declared in this scope
ptrdiff_t yycount = 0;
^~~~~~~~~
examples/c++/calc++/parser.cc:845:5: note: suggested alternatives:
/usr/include/x86_64-linux-gnu/c++/6/bits/c++config.h:202:28: note: 'std::ptrdiff_t'
typedef __PTRDIFF_TYPE__ ptrdiff_t;
^~~~~~~~~
* data/skeletons/lalr1.cc: Qualify ptrdiff_t and size_t with std::.
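A minimal sketch of the qualified spelling, assuming only that <cstddef> is included. That header guarantees std::ptrdiff_t and std::size_t, while the unqualified names may or may not be visible. The function count_args is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements in [first, last), using the qualified type name.
inline std::ptrdiff_t count_args(const char* const* first,
                                 const char* const* last)
{
  return last - first;
}
```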
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
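A minimal sketch of the difference, assuming a hypothetical "remaining capacity" computation. The unsigned version wraps silently to a huge value, while the signed one yields a negative result that is easy to test for.

```cpp
#include <cassert>
#include <cstddef>

// Unsigned arithmetic: used > capacity wraps around without any diagnostic.
inline std::size_t remaining_unsigned(std::size_t capacity, std::size_t used)
{
  return capacity - used;
}

// Signed arithmetic: the same mistake gives a visibly negative value.
inline std::ptrdiff_t remaining_signed(std::ptrdiff_t capacity,
                                       std::ptrdiff_t used)
{
  return capacity - used;
}
```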
* NEWS: Document the API change.
* bootstrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about. Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
* NEWS: Mention this.
* data/skeletons/c.m4 (b4_int_type):
Prefer char if it will do, and prefer signed types to unsigned if
either will do.
* data/skeletons/glr.c (yy_reduce_print): No need to
convert rule line to unsigned long.
(yyrecoverSyntaxError): Put action into an int to
avoid GCC warning of using a char subscript.
* data/skeletons/lalr1.cc (yy_lac_check_, yysyntax_error_):
Prefer ptrdiff_t to size_t.
* data/skeletons/yacc.c (b4_int_type):
Prefer signed types to unsigned if either will do.
* data/skeletons/yacc.c (b4_declare_parser_state_variables):
(YYSTACK_RELOCATE, YYCOPY, yy_lac_stack_realloc, yy_lac)
(yytnamerr, yysyntax_error, yyparse): Prefer ptrdiff_t to size_t.
(YYPTRDIFF_T, YYPTRDIFF_MAXIMUM): New macros.
(YYSIZE_T): Fix "! defined YYSIZE_T" typo.
(YYSIZE_MAXIMUM): Take the minimum of PTRDIFF_MAX and SIZE_MAX.
(YYSIZEOF): New macro.
(YYSTACK_GAP_MAXIMUM, YYSTACK_BYTES, YYSTACK_RELOCATE)
(yy_lac_stack_realloc, yyparse): Use it.
(YYCOPY, yy_lac_stack_realloc): Cast to YYSIZE_T to pacify GCC.
(yy_reduce_print): Use int instead of unsigned long when int
will do.
(yy_lac_stack_realloc): Prefer long to unsigned long when
either will do.
* tests/regression.at: Adjust to these changes.
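A minimal sketch of a size limit taken as the minimum of PTRDIFF_MAX and SIZE_MAX, so that any accepted size fits both a signed index and an allocation request. The names in namespace sketch are hypothetical, not the macros of the skeleton.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

namespace sketch
{
  // Both maxima expressed as size_t, so they can be compared safely.
  constexpr std::size_t pmax =
    static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max());
  constexpr std::size_t smax = std::numeric_limits<std::size_t>::max();

  // The minimum of the two never exceeds PTRDIFF_MAX, so the cast is exact.
  constexpr std::ptrdiff_t size_maximum =
    static_cast<std::ptrdiff_t>(pmax < smax ? pmax : smax);

  // Hypothetical check applied before growing a stack.
  inline bool request_fits(std::ptrdiff_t n)
  {
    return 0 <= n && n <= size_maximum;
  }
}
```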