In push parsers, when asking for the list of expected tokens at some
point, it makes no sense to build a yyparse_context_t: the yypstate
alone suffices (the only difference being the lookahead). Instead of
forcing the user to build a useless shell around yypstate, let's offer
yypstate_expected_tokens.
See https://lists.gnu.org/r/bison-patches/2020-03/msg00025.html.
* data/skeletons/yacc.c (yypstate): Declare earlier, so that we can
use it for...
(yypstate_expected_tokens): this new function, when in push parsers.
Adjust dependencies.
* examples/c/bistromathic/parse.y: Simplify: use
yypstate_expected_tokens.
Style fixes.
Reduce scopes (reported by Joel E. Denny).
* upstream/maint:
maint: post-release administrivia
version 3.5.3
news: update for 3.5.3
yacc.c: make sure we properly propagated the user's number for error
diagnostics: don't crash because of repeated definitions of error
style: initialize some struct members
diagnostics: beware of zero-width characters
diagnostics: be sure to close the styling when lines are too short
muscles: fix incorrect decoding of $
code: be robust to reference with invalid tags
build: fix typo
doc: update recommandation for libtextstyle
style: comment changes
examples: use consistently the GFDL header for readmes
style: remove useless declarations
typo: succesful -> successful
README: point to tests/bison, and document --trace
gnulib: update
maint: post-release administrivia
Let's use GNU readline and its TAB autocompletion to demonstrate the
use of yyexpected_tokens.
This shows a number of weaknesses in our current approach:
- some macros (yyssp, etc.) from push parsers "leak" in user code, we
need to undefine them
- the context needed by yyexpected_tokens does not need the token,
yypstate actually suffices
- yypstate is not properly setup when first allocated, which results
in a crash of yyexpected_tokens if fired before a first token was
read. We should move initialization from yypush_parse into
yypstate_new.
* examples/c/bistromathic/parse.y (yylex): Take input as a string, not
a file.
(EXIT): New token.
(input): Adjust to work only on a line.
(line): Remove.
(symbol_count, process_line, expected_tokens, completion)
(init_readline): New.
* examples/c/bistromathic/bistromathic.test: Adjust expectations.
The bistromathic example should not use Flex, it makes it too complex.
But it was the only example to show location tracking with Flex.
* examples/c/lexcalc/lexcalc.test, examples/c/lexcalc/parse.y,
* examples/c/lexcalc/scan.l: Demonstrate location tracking as is done
in bistromathic.
Since Bison 2.7, output was indented four spaces for explanatory
statements. For example:
input.y:2.7-13: error: %type redeclaration for exp
input.y:1.7-11: previous declaration
Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:
input.y:2.7-13: error: %type redeclaration for exp
2 | %type <float> exp
| ^~~~~~~
input.y:1.7-11: note: previous declaration
1 | %type <int> exp
| ^~~~~
* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
bison 3.5.2
* tag 'v3.5.2':
version 3.5.2
news: 3.5.2
gnulib: update
doc: update Doxygen template
java: avoid trailing white spaces
m4: fix b4_token_format
doc: clearly state that %yacc only makes sense with yacc.c
doc: spell check
examples: be more robust to spaces in paths
larlr1.cc: Reject unsupported values for parse.lac
* data/skeletons/glr.c (YYASSERT): Rename as...
(YY_ASSERT): this, for consistency with yacc.c, and also to emphasize
the fact that this is not for the end user (YY_ prefix).
* tests/glr-regression.at: Define parse.assert.
String literals, which allow for better error messages, are (too)
liberally accepted by Bison, which might result in silent errors. For
instance
%type <exVal> cond "condition"
does not define “condition” as a string alias to 'cond' (nonterminal
symbols do not have string aliases). It is rather equivalent to
%nterm <exVal> cond
%token <exVal> "condition"
i.e., it gives the type 'exVal' to the "condition" token, which was
clearly not the intention.
Introduce -Wdangling-alias to catch this.
* src/complain.h, src/complain.c: Add support for -Wdangling-alias.
(argmatch_warning_args): Sort.
* src/symtab.c (symbol_check_defined): Complain about dangling
aliases.
* doc/bison.texi: Document it.
* tests/input.at (Dangling aliases): New test.
On
%token TOKEN1
%type <ival> TOKEN1 TOKEN2 't'
%token TOKEN2
%%
expr:
bison -Wyacc gives
input.y:2.15-20: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
2 | %type <ival> TOKEN1 TOKEN2 't'
| ^~~~~~
input.y:2.29-31: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
2 | %type <ival> TOKEN1 TOKEN2 't'
| ^~~
input.y:2.22-27: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
2 | %type <ival> TOKEN1 TOKEN2 't'
| ^~~~~~
The messages appear to be out of order, but they are emitted when the
error is found.
* src/symtab.h (symbol_class): Add pct_type_sym, used to denote
symbols appearing in %type.
* src/symtab.c (complain_pct_type_on_token): New.
(symbol_class_set): Check that %type is not applied to tokens.
(symbol_check_defined): pct_type_sym also means undefined.
* src/parse-gram.y (symbol_decl.1): Set the class to pct_type_sym.
* src/reader.c (grammar_current_rule_begin): pct_type_sym also means
undefined.
* tests/input.at (Yacc's %type): New.
* data/skeletons/location.cc: Remove the u (for unsigned) suffix from
the initial line and column.
* NEWS: AFAICT, only C++ backends have their location types changed.
This changes the Yacc skeleton to use “least” integer types to
keep tables smaller on some platforms, which should lessen cache
pressure. Since Bison uses the Yacc skeleton, it follows suit.
* data/skeletons/yacc.c: Include limits.h and stdint.h if this
seems to be needed.
(yytype_uint8, yytype_int8, yytype_uint16, yytype_int16):
If available, use GCC predefined macros __INT_MAX__ etc. to select
a “least” type, as this avoids namespace hassles. Otherwise, if
available fall back on selecting a “least” type via the C99 macros
INT_MAX, INT_LEAST8_MAX, etc. Otherwise, fall further back on one of
the builtin C99 types signed char, short, and int. Make sure that
any selected type promotes to int. Ignore any macros YYTYPE_INT16,
YYTYPE_INT8, YYTYPE_UINT16, YYTYPE_UINT8 defined by the user.
(ptrdiff_t, PTRDIFF_MAX): Simplify in the light of the above.
(yytype_uint8, yytype_uint16): Do not assume that unsigned char
and unsigned short promote to int, as this isn’t true on some
platforms (e.g., TI TMS320C55x).
* src/parse-gram.y (YYTYPE_INT16, YYTYPE_INT8, YYTYPE_UINT16)
(YYTYPE_UINT8): Remove, as these are no longer effective.
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
* NEWS: Document the API change.
* boostrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about. Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
* NEWS: Mention this.
* data/skeletons/c.m4 (b4_int_type):
Prefer char if it will do, and prefer signed types to unsigned if
either will do.
* data/skeletons/glr.c (yy_reduce_print): No need to
convert rule line to unsigned long.
(yyrecoverSyntaxError): Put action into an int to
avoid GCC warning of using a char subscript.
* data/skeletons/lalr1.cc (yy_lac_check_, yysyntax_error_):
Prefer ptrdiff_t to size_t.
* data/skeletons/yacc.c (b4_int_type):
Prefer signed types to unsigned if either will do.
* data/skeletons/yacc.c (b4_declare_parser_state_variables):
(YYSTACK_RELOCATE, YYCOPY, yy_lac_stack_realloc, yy_lac)
(yytnamerr, yysyntax_error, yyparse): Prefer ptrdiff_t to size_t.
(YYPTRDIFF_T, YYPTRDIFF_MAXIMUM): New macros.
(YYSIZE_T): Fix "! defined YYSIZE_T" typo.
(YYSIZE_MAXIMUM): Take the minimum of PTRDIFF_MAX and SIZE_MAX.
(YYSIZEOF): New macro.
(YYSTACK_GAP_MAXIMUM, YYSTACK_BYTES, YYSTACK_RELOCATE)
(yy_lac_stack_realloc, yyparse): Use it.
(YYCOPY, yy_lac_stack_realloc): Cast to YYSIZE_T to pacify GCC.
(yy_reduce_print): Use int instead of unsigned long when int
will do.
(yy_lac_stack_realloc): Prefer long to unsigned long when
either will do.
* tests/regression.at: Adjust to these changes.
* upstream/maint:
c++: add copy ctors for compatibility with the IAR compiler
CI: show git status
CI: disable ICC
tests: pass -jN from Make to the test suite
quotearg: avoid leaks
maint: post-release administrivia
We used to treat lone CRs (\r, aka ^M) as regular NLs (\n), probably
to please Classic MacOS. As of today, it makes more sense to treat \r
like a plain white space character.
https://lists.gnu.org/archive/html/bison-patches/2019-09/msg00027.html
* src/scan-gram.l (no_cr_read): Remove. Instead, use...
(eol): this new abbreviation denoting end-of-line.
* src/location.c (caret_getc): New.
(location_caret): Use it.
* tests/diagnostics.at (Carriage return): Adjust expectations.
(CR NL): New.