Now that yacc.c and glr.c both know yysymbol_type_t, convert the
common routines.
* data/skeletons/c.m4 (yydestruct, yy_symbol_value_print)
(yy_symbol_print): Use yysymbol_type_t instead of int.
* data/skeletons/glr.c: Use yySymbol where appropriate.
* data/skeletons/yacc.c (YY_ACCESSING_SYMBOL): New wrapper around
yystos.
Use it.
* tests/local.at (yyreport_syntax_error): Use yysymbol_type_t where
appropriate.
Apply the same changes as in yacc.c. Now yySymbol and yysymbol_type_t
are aliases. We will remove the former later, to avoid cluttering
this commit.
* data/skeletons/glr.c: Use b4_declare_symbol_enum.
Use YYSYMBOL_YYEOF etc. where appropriate.
(YYUNDEFTOK, YYTERROR): Remove.
(YYTRANSLATE, yySymbol, yyexpected_tokens, yysyntax_error_arguments):
Adjust.
(yy_accessing_symbol): New.
Use it where appropriate.
We could just "inline yysyntax_error_arguments back" in the routines
it was originally extracted from, but I think the code is nicer to
read this way.
* data/skeletons/glr.c (yysyntax_error_arguments): Generate only for
detailed and verbose error messages.
* data/skeletons/yacc.c: Likewise.
* data/skeletons/lalr1.cc (parser::context::yysyntax_error_arguments):
Move as...
(parser::yysyntax_error_arguments_): this.
And only for detailed and verbose error messages.
Because glr.c shares the same testing routines, we also need to
convert it.
* data/skeletons/glr.c (yyparse_context_token): New.
* tests/local.at (yyreport_syntax_error): here.
These macros have been extremely useful when we had to support K&R C,
which we dropped long ago. Now, they merely make the code uselessly
hard to read.
* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/yacc.c:
Stop using b4_function_define.
Prefer b4_parse_error_case over the adhoc solution
`m4_case + b4_percent_define_get`. Same for b4_parse_error_bmatch.
* data/skeletons/glr.c: here
* data/skeletons/yacc.c: here
parse.error has more than two possible values.
* data/skeletons/bison.m4 (b4_error_verbose_if, b4_error_verbose_flag):
Remove.
(b4_parse_error_case, b4_parse_error_bmatch): New.
Adjust dependencies.
The Java skeleton displays
Reading a token:
Next token is token "number" (1)
while the other display
Reading a token: Next token is token "number" (1)
When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in. So let's propagate the Java way, but with the colon.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
When building the test cases, emitting code in the epilogue is very
constraining. Let's make it simpler thanks to %code epilogue.
However, I don't want to document this: it is bad style to use it (we
should avoid having too many ways to write the same thing,
TI!MTOWTDI), just put your code in the true epilogue section.
* data/skeletons/glr.c, data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c: Implement support for %code epilogue.
Remove useless comments.
* tests/calc.at, tests/java.at: Simplify.
Modeled after what was done in yacc.c, yet somewhat different since
yyGLRStack play the role of the full yyparse_context_t. It will
probably be defined as a alias, to make interfaces compatible.
LAC is not supported here. And is somewhat tricky.
* data/skeletons/glr.c (yyexpected_tokens, yysyntax_error_arguments): New.
* data/skeletons/glr.c: Move the handling of error messages after the
definition of types. Especially after yyGLRStack, which we will need
in forthcoming patches.
Simplify: yyGLRStack is defined at this point.
Only parse.error verbose and simple will get the original yytname: the
other options will rely on a different table. So let's move on top of
the yysymbol_name function.
* data/skeletons/c.m4 (yy_symbol_print): Use yysymbol_name.
* data/skeletons/glr.c (yytokenName): Rename as...
(yysymbol_name): this.
The change of naming scheme is unfortunate, but it's definitely glr.c
which is "wrong".
Currently yy_symbol_print is defined before yytokenName, although it
should use it instead of read yytname directly. Move blocks around to
avoid this.
* data/skeletons/glr.c (yy_symbol_print): Move its definition after
that of yytokenName.
* data/skeletons/glr.c (YYASSERT): Rename as...
(YY_ASSERT): this, for consistency with yacc.c, and also to emphasize
the fact that this is not for the end user (YY_ prefix).
* tests/glr-regression.at: Define parse.assert.
The current code for yysyntax_error for %define parse.error verbose is
fishy (given that YYEMPTY is -2, invalid argument for yytname[]):
static int
yysyntax_error ([...])
{
YYPTRDIFF_T yysize0 = yytnamerr (YY_NULLPTR, yytname[yytoken]);
[...]
if (yytoken != YYEMPTY)
A nearby comment reports
The only way there can be no lookahead present (in yychar) is if
this state is a consistent state with a default action. Thus,
detecting the absence of a lookahead is sufficient to determine
that there is no unexpected or expected token to report. In that
case, just report a simple "syntax error".
So it _is_ possible to call yysyntax_error with yytoken == YYEMPTY,
albeit quite difficult when meaning to, so virtually impossible by
accident (after all, there was never a bug report about this).
I failed to produce a test case, but Joel E. Denny provided me with
one (added to the test suite below). The yacc.c skeleton fails on
this, and once fixed dies on a second problem. The glr.c skeleton was
also dying, but immediately of this second problem.
Indeed we were not allocating space for the error message's final \0.
This was hidden by the fact that we only had error messages with at
least an unexpected token displayed, so with at least one "%s" in the
format string, whose size (2) was included (incorrectly) in the final
size of the message (where the %s have been replaced by the actual
content).
* data/skeletons/glr.c, data/skeletons/yacc.c (yysyntax_error):
Do not invoke yytnamerr on YYEMPTY.
Clarify the computation of the length of the _final_ error message,
with the NUL terminator but without the '%s's.
* tests/conflicts.at (Syntax error in consistent error state):
New, contributed by Joel E. Denny.
We still have a few old C casts in lalr1.cc, let's get rid of them.
Reported by Frank Heckenbach.
Actually, let's monitor all our casts using easy to grep macros.
Let's use these macros to use the C++ standard casts when we are in
C++.
* data/skeletons/c.m4 (b4_cast_define): New.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/stack.hh,
* data/skeletons/yacc.c:
Use it and/or its casts.
* tests/actions.at, tests/cxx-type.at,
* tests/glr-regression.at, tests/headers.at, tests/torture.at,
* tests/types.at:
Use YY_CAST instead of C casts.
* configure.ac (warn_cxx): Add -Wold-style-cast.
* doc/bison.texi: Disable it.
That way, glr.c can use it too.
* data/skeletons/c.m4 (b4_int_type):
Do not special-case ‘char’; it’s not worth the trouble,
as clang complains about char subscripts.
(b4_c99_int_type, b4_c99_int_type_define): New macros,
taken from yacc.c.
* data/skeletons/glr.c: Use b4_int_type_define.
* data/skeletons/yacc.c (b4_int_type): Remove, since there’s
no longer any need to redefine it.
Use b4_c99_int_type_define rather than its body.
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
* NEWS: Document the API change.
* boostrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about. Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
* NEWS: Mention this.
* data/skeletons/c.m4 (b4_int_type):
Prefer char if it will do, and prefer signed types to unsigned if
either will do.
* data/skeletons/glr.c (yy_reduce_print): No need to
convert rule line to unsigned long.
(yyrecoverSyntaxError): Put action into an int to
avoid GCC warning of using a char subscript.
* data/skeletons/lalr1.cc (yy_lac_check_, yysyntax_error_):
Prefer ptrdiff_t to size_t.
* data/skeletons/yacc.c (b4_int_type):
Prefer signed types to unsigned if either will do.
* data/skeletons/yacc.c (b4_declare_parser_state_variables):
(YYSTACK_RELOCATE, YYCOPY, yy_lac_stack_realloc, yy_lac)
(yytnamerr, yysyntax_error, yyparse): Prefer ptrdiff_t to size_t.
(YYPTRDIFF_T, YYPTRDIFF_MAXIMUM): New macros.
(YYSIZE_T): Fix "! defined YYSIZE_T" typo.
(YYSIZE_MAXIMUM): Take the minimum of PTRDIFF_MAX and SIZE_MAX.
(YYSIZEOF): New macro.
(YYSTACK_GAP_MAXIMUM, YYSTACK_BYTES, YYSTACK_RELOCATE)
(yy_lac_stack_realloc, yyparse): Use it.
(YYCOPY, yy_lac_stack_realloc): Cast to YYSIZE_T to pacify GCC.
(yy_reduce_print): Use int instead of unsigned long when int
will do.
(yy_lac_stack_realloc): Prefer long to unsigned long when
either will do.
* tests/regression.at: Adjust to these changes.
Instead of
#define YYPACT_NINF -130
#define yypact_value_is_default(Yystate) \
(!!((Yystate) == (-130)))
generate
#define YYPACT_NINF (-130)
#define yypact_value_is_default(Yyn) \
((Yyn) == YYPACT_NINF)
* data/skeletons/c.m4 (b4_table_value_equals): Add support for $4.
* data/skeletons/glr.c, data/skeletons/yacc.c: Use it.
Also, use shorter macro argument names, the name of the macro is clear
enough.
The CI, with CC='gcc-7 -fsanitize=undefined,address
-fno-omit-frame-pointer', reports:
calc.cc:1652:50: runtime error: load of value 190, which is not a valid value for type 'bool'
../../tests/calc.at:867: cat stderr
--- expout 2019-09-05 20:30:37.887257545 +0000
+++ /home/travis/build/bison-3.4.1.72-79a1-dirty/_build/tests/testsuite.dir/at-groups/438/stdout 2019-09-05 20:30:37.887257545 +0000
@@ -1 +1,2 @@
syntax error
+calc.cc:1652:50: runtime error: load of value 190, which is not a valid value for type 'bool'
438. calc.at:867: 438. Calculator glr.cc (calc.at:867): FAILED (calc.at:867)
The problem is that yylookaheadNeeds is not initialized in
yyinitStateSet, and when it is copied, the value is not 0 or 1.
* data/skeletons/glr.c (yylookaheadNeeds): Initialize yylookaheadNeeds.
Currently, with --no-lines, instead of "#line file line\n", we emit
"\n". Let's emit nothing.
* data/skeletons/bison.m4 (b4_syncline): Emit at end-of-line when enabled.
* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, src/output.c: Use dnl after b4_syncline to
avoid spurious empty lines.
* tests/synclines.at (Sync Lines): Make sure that --no-lines is like
grep -v #line.
* tests/calc.at: Make sure that a rich grammar file behaves properly
with %no-lines.
The variable spec_defines_file denotes the name of the generated
header. Its name is derived from --defines/%defines, whose name in
turn is derived from the fact that the header, in Yacc, contained the
Not only does the header now contain a lot more than just the token
definitions, but we no longer even generate macros, but an enum...
Let's modernize our vocabulary.
* src/files.h, src/files.c (spec_defines_file): Rename as...
(spec_header_file): this.
Reported by Derek Clegg.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00004.html
* configure.ac (warn_common): Add -Wimplicit-fallthrough.
This does trigger failures in the test suite.
* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c, tests/c++.at:
Make fall-throws explicit.