Currently EOF is handled in an adhoc way, with a #define YYEOF 0 in
the implementation file. As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.
Give the $end token a visible kind name, YYEOF. Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.
* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
Since Bison 2.7, output was indented four spaces for explanatory
statements. For example:
input.y:2.7-13: error: %type redeclaration for exp
input.y:1.7-11: previous declaration
Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:
input.y:2.7-13: error: %type redeclaration for exp
2 | %type <float> exp
| ^~~~~~~
input.y:1.7-11: note: previous declaration
1 | %type <int> exp
| ^~~~~
* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
On
%token TOKEN1
%type <ival> TOKEN1 TOKEN2 't'
%token TOKEN2
%%
expr:
bison -Wyacc gives
input.y:2.15-20: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
2 | %type <ival> TOKEN1 TOKEN2 't'
| ^~~~~~
input.y:2.29-31: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
2 | %type <ival> TOKEN1 TOKEN2 't'
| ^~~
input.y:2.22-27: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
2 | %type <ival> TOKEN1 TOKEN2 't'
| ^~~~~~
The messages appear to be out of order, but they are emitted when the
error is found.
* src/symtab.h (symbol_class): Add pct_type_sym, used to denote
symbols appearing in %type.
* src/symtab.c (complain_pct_type_on_token): New.
(symbol_class_set): Check that %type is not applied to tokens.
(symbol_check_defined): pct_type_sym also means undefined.
* src/parse-gram.y (symbol_decl.1): Set the class to pct_type_sym.
* src/reader.c (grammar_current_rule_begin): pct_type_sym also means
undefined.
* tests/input.at (Yacc's %type): New.
We have too many global variables, adding structure would help. For a
start, let's hide some of the variables closer to their usage.
* src/getargs.c, src/files.h (current_file): Move to...
* src/scan-gram.c: here.
* src/scan-gram.h (gram_in, gram__flex_debug): Remove, make them
private to the scanner.
* src/reader.h, src/reader.c (reader): Take a grammar file as argument.
Move the handling of scanner variables to...
* src/scan-gram.l (gram_scanner_open, gram_scanner_close): here.
(gram_scanner_initialize): Remove, replaced by gram_scanner_open.
* src/main.c: Adjust.
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
* NEWS: Document the API change.
* boostrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about. Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
Currently we remove the rhs to install %empty instead.
* src/reader.c (grammar_rule_check_and_complete): Insert the missing
%empty in front of the rhs, not in replacement thereof.
* tests/actions.at (Add missing %empty): Check that.
Some members are called foo_location, others are foo_loc. Stick to
the latter.
* src/gram.h, src/location.h, src/location.c, src/output.c,
* src/parse-gram.y, src/reader.h, src/reader.c, src/reduce.c,
* src/scan-gram.l, src/symlist.h, src/symlist.c, src/symtab.h,
* src/symtab.c:
Use _loc consistently, not _location.
symbol_list features a 'location' and a 'sym_loc' member. The former
is expected to be set only for symbol_lists that denote a symbol (not
a type name), and the latter should only denote the location of the
symbol/type name. Yet both are set, and the name "location" is too
unprecise.
* src/symlist.h, src/symlist.c (symbol_list::location): Rename as
rhs_loc for clarity. Move it to the "section" of data valid only
for rules.
* src/reader.c, src/scan-code.l: Adjust.
* cfg.mk: Disable checks where needed (e.g., we do want to check the
behavior with tabs).
(sc_at_parser_check): Remove. Unfortunately since
a11c144609 we no longer use the './'
prefix to run programs in the current directory. That was so that we
could run Java programs like the other, although they are no run with
the `./` prefix (see 967a59d2c0).
As a consequence this sc check no longer makes sense.
However, since now AT_PARSER_CHECK passes the `./` prefix itself, this
sc-check was superfluous.
* examples/c/reccalc/scan.l: Use memcpy, not strncpy.
* src/ielr.c, src/reader.c: Obfuscate "lr(0)" so that the sc-check for
"space before paren" does not fire.
* tests/diagnostics.at: Avoid space-tab, use tab-tab.
* src/reader.c (grammar_rule_check_and_complete): When 'p' and 'lhs'
are aliases, prefer the latter, for clarity and consistency.
(grammar_current_rule_begin): Avoid 'p', current_rule suffices.
* src/gram.h, src/gram.c: Comment changes.
ptdr# calc.tab.c
This should not be used to generate parsers. My point is actually to
facilitate debugging (when tweaking the generation of the LR(0)
automaton for instance, not carying -yet- about lookaheads).
* src/reader.c (prepare_percent_define_front_end_variables): Add lr(0).
* src/conflicts.c (set_conflicts): Be robust to reds not having
lookaheads at all.
* src/ielr.c (LrType, lr_type_get): Adjust.
(ielr): Implement support for LR(0).
* src/lalr.c (lalr_free): Don't free LA when it's not computed.
This change allows one to document (and check) which rules participate
in shift/reduce and reduce/reduce conflicts. This is particularly
important GLR parsers, where conflicts are a normal occurrence. For
example,
%glr-parser
%expect 1
%%
...
argument_list:
arguments %expect 1
| arguments ','
| %empty
;
arguments:
expression
| argument_list ',' expression
;
...
Looking at the output from -v, one can see that the shift-reduce
conflict here is due to the fact that the parser does not know whether
to reduce arguments to argument_list until it sees the token AFTER the
following ','. By marking the rule with %expect 1 (because there is a
conflict in one state), we document the source of the 1 overall shift-
reduce conflict.
In GLR parsers, we can use %expect-rr in a rule for reduce/reduce
conflicts. In this case, we mark each of the conflicting rules. For
example,
%glr-parser
%expect-rr 1
%%
stmt:
target_list '=' expr ';'
| expr_list ';'
;
target_list:
target
| target ',' target_list
;
target:
ID %expect-rr 1
;
expr_list:
expr
| expr ',' expr_list
;
expr:
ID %expect-rr 1
| ...
;
In a statement such as
x, y = 3, 4;
the parser must reduce x to a target or an expr, but does not know
which until it sees the '='. So we notate the two possible reductions
to indicate that each conflicts in one rule.
See https://lists.gnu.org/archive/html/bison-patches/2013-02/msg00105.html.
* doc/bison.texi (Suppressing Conflict Warnings): Document %expect,
%expect-rr in grammar rules.
* src/conflicts.c (count_state_rr_conflicts): Adjust comment.
(rule_has_state_sr_conflicts): New static function.
(count_rule_sr_conflicts): New static function.
(rule_nast_state_rr_conflicts): New static function.
(count_rule_rr_conflicts): New static function.
(rule_conflicts_print): New static function.
(conflicts_print): Also use rule_conflicts_print to report on individual
rules.
* src/gram.h (struct rule): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* src/reader.c (grammar_midrule_action): Transfer expected_sr_conflicts,
expected_rr_conflicts to new rule, and turn off in current_rule.
(grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
(packgram): Transfer expected_sr_conflicts, expected_rr_conflicts
to new rule.
* src/reader.h (grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
* src/symlist.c (symbol_list_sym_new): Initialize expected_sr_conflicts,
expected_rr_conflicts.
* src/symlist.h (struct symbol_list): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* tests/conflicts.at: Add tests "%expect in grammar rule not enough",
"%expect in grammar rule right.", "%expect in grammar rule too much."
Currently, in C, the default semantic action is implemented by being
always run before running the actual user semantic action. As a
consequence, when the user action is run, $$ is already set as $1.
In C++ with variants, we don't do that, since we cannot manipulate the
semantic value without knowing its exact type. When variants are
enabled, the only guarantee is that $$ is default contructed and ready
to the used.
Some users still would like the default action to be run with
variants. Frank Heckenbach's parser in
C++17 (http://lists.gnu.org/archive/html/bug-bison/2018-04/msg00011.html)
provides this feature, but relying on std::variant's dynamic typing,
which we forbid in lalr1.cc.
The simplest seems to be actually generating the default semantic
action (in all languages/skeletons). This makes the pre-action (that
sets $$ to $1) useless. But... maybe some users depend on this, in
spite of the comments that clearly warn againt this. So let's not
turn this off just yet.
* src/reader.c (grammar_rule_check_and_complete): Rename as...
(grammar_rule_check_and_complete): this.
Install the default semantic action when applicable.
* examples/variant-11.yy, examples/variant.yy, tests/calc.at:
Exercise the default semantic action, even with variants.
The code was already using midrule only, never mid_rule. This is
simpler to remember, and matches a similar change we made from
look-ahead to lookahead.
* NEWS, doc/bison.texi, src/reader.c, src/scan-code.h, src/scan-code.l
* tests/actions.at, tests/c++.at, tests/existing.at: here.
Suggested by Paul Eggert.
* src/reader.c (find_start_symbol): Don't check 'res', we know it is
not null. That suffices to avoid the GCC warnings.
* bootstrap.conf: We don't need 'assume', which doesn't exist anyway.
Commit 3df32101e7 introduced invalid C
code. Caught by GCC 7.3.0.
* bootstrap.conf (gnulib_modules): We need assume.
* src/reader.c (find_start_symbol): Fix the signature (too much C++,
sorry...).
Prefer 'assume' to 'assert', so that we don't have these warnings even
when NDEBUG is defined.
Make sure that we cannot apply a type to the (main) action of a rule.
* src/reader.c (grammar_rule_check): Issue the warning.
* tests/input.at (Cannot type action): Check the warning.
Prompted on Piotr Marcińczyk's message:
http://lists.gnu.org/archive/html/bug-bison/2017-06/msg00000.html.
See also http://lists.gnu.org/archive/html/bug-bison/2018-06/msg00001.html.
Because their type is unknown to Bison, the values of midrule actions are
not treated like the others: they don't have %printer and %destructor
support. In addition, in C++, (Bison) variants cannot work properly.
Typed midrule actions address these issues. Instead of:
exp: { $<ival>$ = 1; } { $<ival>$ = 2; } { $$ = $<ival>1 + $<ival>2; }
write:
exp: <ival>{ $$ = 1; } <ival>{ $$ = 2; } { $$ = $1 + $2; }
* src/scan-code.h, src/scan-code.l (code_props): Add a `type` field to
record the declared type of an action.
(code_props_rule_action_init): Add a type argument.
* src/parse-gram.y: Accept an optional type tag for actions.
* src/reader.h, src/reader.c (grammar_current_rule_action_append): Add
a type argument.
(grammar_midrule_action): When a mid-rule is typed, pass its type to
the defined dummy non terminal symbol.
grammar_current_rule_action_append was used in two different places:
for actual action (`{...}`), and for predicates (`%?{...}`). Let's
split this in two different functions.
* src/reader.h, src/reader.c (grammar_current_rule_predicate_append): New.
Extracted from...
(grammar_current_rule_action_append): here.
Remove arguments that don't apply.
Adjust dependencies.
* origin/maint:
build: don't try to generate docs when cross-compiling
package: fix a reporter's name
%union: fix the support for named %union
package: bump to 2015
flex: don't trust YY_USER_INIT
yacc.c: fix broken union when api.value.type=union and %defines are used
doc: fix missing xref
gnulib: update
location: remove some ugly debugging code traces
build: use abort to pacify compiler errors
package: bump to 2014
doc: specify documentation encoding
Rather than having duplicate info in the symbol and the alias that has
to be resolved later on, both the symbol and the alias have a common
pointer to a separate structure containing this info.
* src/symtab.h (sym_content): New structure.
* src/symtab.c (sym_content_new, sym_content_free, symbol_free): New
* src/AnnotationList.c, src/conflicts.c, src/gram.c, src/gram.h,
* src/graphviz.c, src/ielr.c, src/output.c, src/parse-gram.y, src/print.c
* src/print-xml.c, src/print_graph.c, src/reader.c, src/reduce.c,
* src/state.h, src/symlist.c, src/symtab.c, src/symtab.h, src/tables.c:
Adjust.
* tests/input.at: Fix expectations (order changes).
When reporting a duplicate directive on a rule, point to its first
occurrence:
one.y:11.10-15: error: only one %empty allowed per rule
%empty {} %empty
^^^^^^
one.y:11.3-8: previous declaration
%empty {} %empty
^^^^^^
And consistently discard the second one.
* src/complain.h, src/complain.c (duplicate_directive): New.
* src/reader.c: Use it where appropriate.
* src/symlist.h, src/symlist.c (symbol_list): Add a dprec_location member.
* tests/actions.at: Adjust expected output.
* src/complain.h, src/complain.c (warning_is_unset): New.
* src/reader.c (grammar_current_rule_empty_set): If enabled -Wempty-rule,
if not disabled.
* tests/actions.at (Implicitly empty rule): Check this feature.
Also check that -Wno-empty-rule does disable this warning.