Now that the parser can read several start symbols, let's process
them, and create the corresponding rules.
* src/parse-gram.y (grammar_declaration): Accept a list of start symbols.
* src/reader.h, src/reader.c (grammar_start_symbol_set): Rename as...
(grammar_start_symbols_set): this.
* src/reader.h, src/reader.c (start_flag): Replace with...
(start_symbols): this.
* src/reader.c (grammar_start_symbols_set): Build a list of start
symbols.
(switching_token, create_start_rules): New.
(check_and_convert_grammar): Use them to turn the list of start
symbols into a set of rules.
* src/reduce.c (nonterminals_reduce): Don't complain about $accept,
it's an internal detail.
(reduce_grammar): Complain about all the start symbols that don't
derive sentences.
* src/symtab.c (startsymbol, startsymbol_loc): Remove, replaced by
start_symbols.
symbols_pack): Move the check about the start symbols
to...
* src/symlist.c (check_start_symbols): here.
Adjust to multiple start symbols.
* tests/reduce.at (Empty Language): Generalize into...
(Bad start symbols): this.
Currently the suggestion to rerun is a -Wother warning:
warning: 2 shift/reduce conflicts [-Wconflicts-sr]
warning: rerun with option '-Wcounterexamples' to generate conflict counterexamples [-Wother]
Instead, let's attach it as a subnote of the diagnostic (in the
current case, -Wconflicts-sr):
warning: 2 shift/reduce conflicts [-Wconflicts-sr]
note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
* src/conflicts.c (conflicts_print): Do that.
Adjust the test suite.
Suggesting -Wcounterexamples when there are conflicts is probably not
what the user wants. If she knows her conflicts and has set
%expect/%expect-rr appropriately, we shouldn't warn.
The commit also swaps the counterexamples and the report of conflicts,
into, IMHO, a more natural order: from
Shift/reduce conflict on token B:
1: 3 a: A .
1: 8 y: A . B
Example A • B C
First derivation s ::=[ a ::=[ A • ] x ::=[ B C ] ]
Example A • B C
Second derivation s ::=[ y ::=[ A • B ] c ::=[ C ] ]
input.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
input.y:4.4: warning: rule useless in parser due to conflicts [-Wother]
to
input.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
Shift/reduce conflict on token B:
1: 3 a: A .
1: 8 y: A . B
Example A • B C
First derivation s ::=[ a ::=[ A • ] x ::=[ B C ] ]
Example A • B C
Second derivation s ::=[ y ::=[ A • B ] c ::=[ C ] ]
input.y:4.4: warning: rule useless in parser due to conflicts [-Wother]
* src/conflicts.c (rule_conflicts_print): Rename as...
(report_rule_expectation_mismatches): this.
Move the handling of report_counterexamples to...
(conflicts_print): Here.
Display this warning when applicable.
Currently, when we quote the source file, we indent it with one space,
and preserve tabulations, so there is a discrepancy and the visual
rendering is bad. One way out is to indent with a tab instead of a
space, but then this space can be used for more information. This is
what GCC9 does. Let's play copy cats.
See
https://lists.gnu.org/archive/html/bison-patches/2019-04/msg00025.htmlhttps://developers.redhat.com/blog/2019/03/08/usability-improvements-in-gcc-9/https://gcc.gnu.org/onlinedocs/gccint/Guidelines-for-Diagnostics.html#Guidelines-for-Diagnostics
* src/location.c (location_caret): Prefix quoted lines with the line
number and a pipe, fitting 8 columns.
* tests/actions.at, tests/c++.at, tests/conflicts.at,
* tests/diagnostics.at, tests/input.at, tests/java.at,
* tests/named-refs.at, tests/reduce.at, tests/regression.at,
* tests/sets.at: Adjust expectations.
Partly by "./build-aux/update-test tests/testsuite.dir/*/testsuite.log"
repeatedly, and partly by hand.
The format is inconsistent. For instance most sections are
indented (including "Terminals unused in grammar" for instance), but
the sections "Terminals, with rules where they appear" and
"Nonterminals, with rules where they appear" are not. Let's indent
them. Also, these two sections try to wrap the output to avoid lines
too long. Yet we don't do that in the rest of the file, for instance
when listing the lookaheads of an item.
For instance in the case of Bison's parse-gram.output we go from:
Terminals, with rules where they appear
"end of file" (0) 0
error (256) 28 88
"string" <char*> (258) 9 13 16 17 20 23 24 109 116
[...]
Nonterminals, with rules where they appear
$accept (58)
on left: 0
input (59)
on left: 1, on right: 0
prologue_declarations (60)
on left: 2 3, on right: 1 3
prologue_declaration (61)
on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24
25 26 27 28 29, on right: 3
[...]
to
Terminals, with rules where they appear
"end of file" (0) 0
error (256) 28 88
"string" <char*> (258) 9 13 16 17 20 23 24 109 116
[...]
Nonterminals, with rules where they appear
$accept (58)
on left: 0
input (59)
on left: 1
on right: 0
prologue_declarations (60)
on left: 2 3
on right: 1 3
prologue_declaration (61)
on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24 25 26 27 28 29
on right: 3
[...]
* src/print.c (END_TEST): Remove.
(print_terminal_symbols): Don't try to wrap the output.
(print_nonterminal_symbols): Likewise.
Make two different lines for occurrences on the left, and occurrence
on the rhs of the rules.
Indent by 4 and 8, not 3.
* src/reduce.c (reduce_output): Indent by 4, not 3.
* tests/conflicts.at, tests/existing.at, tests/reduce.at,
* tests/regression.at, tests/report.at:
Adjust.
* maint:
maint: post-release administrivia
version 3.3.2
style: minor fixes
NEWS: named constructors are preferable to symbol_type ctors
gram: fix handling of nterms in actions when some are unused
style: rename local variable
CI: update the ICC serial number for travis-ci.org
Since Bison 3.3, semantic values in rule actions (i.e., '$...') are
passed to the m4 backend as the symbol number. Unfortunately, when
there are unused symbols, the symbols are renumbered _after_ the
numbers were used in the rule actions. As a result, the evaluation of
the skeleton failed because it used non existing symbol numbers.
Which is the happy scenario: we could use numbers of other existing
symbols...
Reported by Balázs Scheidler.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00044.html
Translating the rule actions after the symbol renumbering moves too
many parts in bison. Relying on the symbol identifiers is more
troublesome than it might first seem: some don't have an
identifier (tokens with only a literal string), some might have a
complex one (tokens with a literal string with characters special for
M4). Well, these are tokens, but nterms also have issues: "dummy"
nterms (for midrule actions) are named $@32 etc. which is risky for
M4.
Instead, let's simply give M4 the mapping between the old numbers and
the new ones. To avoid confusion between old and new numbers, always
emit pre-renumbering numbers as "orig NUM".
* data/README: Give details about "orig NUM".
* data/skeletons/bison.m4 (__b4_symbol, _b4_symbol): Resolve the
"orig NUM".
* src/output.c (prepare_symbol_definitions): Pass nterm_map to m4.
* src/reduce.h, src/reduce.c (nterm_map): Extract it from
nonterminals_reduce, to make it public.
(reduce_free): Free it.
* src/scan-code.l (handle_action_dollar): When referring to a nterm,
use "orig NUM".
* tests/reduce.at (Useless Parts): New, based Balázs Scheidler's
report.
Currently on the following grammar:
%type <foo> foo
%%
start: foo | bar | "baz"
foo: foo
bar: bar
bison reports:
warning: 2 nonterminals useless in grammar [-Wother]
warning: 4 rules useless in grammar [-Wother]
1.13-15: warning: nonterminal useless in grammar: foo [-Wother]
%type <foo> foo
^^^
3.14-16: warning: nonterminal useless in grammar: bar [-Wother]
start: foo | bar | "baz"
^^^
[...]
i.e., the location of the first occurrence of a symbol is taken as its
definition point. In the case of nonterminals, the first occurrence
as a left-hand side of a rule makes more sense:
warning: 2 nonterminals useless in grammar [-Wother]
warning: 4 rules useless in grammar [-Wother]
4.1-3: warning: nonterminal useless in grammar: foo [-Wother]
foo: foo
^^^
5.1-3: warning: nonterminal useless in grammar: bar [-Wother]
bar: bar
^^^
[...]
* src/symtab.h, src/symtab.c (symbol::location_of_lhs): New.
(symbol_location_as_lhs_set): New.
* src/parse-gram.y (current_lhs): Use it.
* tests/reduce.at: Update locations.
In the following grammar, the 'exp' nonterminal is trivially useless.
So, of course, its rules are useless too.
%%
input: '0' | exp
exp: exp '+' exp | exp '-' exp | '(' exp ')'
Previously all the useless rules were reported, including those whose
left-hand side is the 'exp' nonterminal:
warning: 1 nonterminal useless in grammar [-Wother]
warning: 4 rules useless in grammar [-Wother]
2.14-16: warning: nonterminal useless in grammar: exp [-Wother]
input: '0' | exp
^^^
2.14-16: warning: rule useless in grammar [-Wother]
input: '0' | exp
^^^
! 3.6-16: warning: rule useless in grammar [-Wother]
! exp: exp '+' exp | exp '-' exp | '(' exp ')'
! ^^^^^^^^^^^
! 3.20-30: warning: rule useless in grammar [-Wother]
! exp: exp '+' exp | exp '-' exp | '(' exp ')'
! ^^^^^^^^^^^
! 3.34-44: warning: rule useless in grammar [-Wother]
! exp: exp '+' exp | exp '-' exp | '(' exp ')'
! ^^^^^^^^^^^
The interest of being so verbose is dubious. I suspect most of the
time nonterminals are not expected to be useless, so the user wants to
fix the nonterminal, not remove its rules. And even if the user
wanted to get rid of its rules, the position of these rules probably
does not help more that just having the name of the nonterminal.
This commit discard these messages, marked with '!', and keep the
others. In particular, we still report:
2.14-16: warning: rule useless in grammar [-Wother]
input: '0' | exp
^^^
All the useless rules (including the '!' ones) are still reported in
the reports (xml, text, etc.); only the diagnostics on stderr change.
* src/gram.c (grammar_rules_useless_report): Don't complain about
useless rules whose lhs is useless.
* src/reduce.h, src/reduce.c (reduce_nonterminal_useless_in_grammar):
Take a sym_content as argument.
Adjust callers.
* tests/reduce.at (Useless Rules, Underivable Rules, Reduced Automaton):
Adjust.
* src/gram.c (grammar_rules_useless_report): Let -fcaret handle the
pretty-printing of the guilty rules.
(rule_print): Inline in its only use.
* tests/conflicts.at, tests/existing.at, tests/reduce.at,
* tests/regression.at: Adjust.
* NEWS: Document.
* origin/maint:
misc: pacify the Tiny C Compiler
cpp: make the check of Flex version portable
misc: require getline
c++: support wide strings for file names
doc: document carets
tests: enhance existing tests with carets
errors: show carets
getargs: add support for --flags/-f
Conflicts:
doc/bison.texi
m4/.gitignore
src/complain.c
src/flex-scanner.h
src/getargs.c
src/getargs.h
src/gram.c
src/main.c
tests/headers.at
* tests/actions.at: Unset value.
* tests/conflicts.at: Rule useless due to conflicts.
* tests/input.at: Missing terminator, unexpected end of file, command line
redefinition of variable.
* tests/named-refs.at: Many errors.
* tests/reduce.at: Useless nonterminals and rules.
* tests/regression.at: Large token.
* origin/maint:
tests: close files in glr-regression
xml: match DOT output and xml2dot.xsl processing
xml: factor xslt space template
graph: fix a memory leak
xml: documentation
output: capitalize State
The current routines used to display s/r and r/r conflicts are both
inconvenient from the programmer point of view (they do not use the
warning infrastructure) and for the user (the messages are rather
terse, not necessarily pleasant to read, and because they don't use
the same routines, they look different).
It was due to the belief (dating back to the initial checked-in
version of Bison) that, at some point, POSIX Yacc mandated the format
for these messages. Today, the Open Group's manual page for Yacc,
<http://pubs.opengroup.org/onlinepubs/009695399/utilities/yacc.html>,
explicitly states that the format of these messages is unspecified.
See commit be7280480c and
<http://lists.gnu.org/archive/html/bison-patches/2002-12/msg00027.html>.
For a discussion on the chosen warning format, see
http://lists.gnu.org/archive/html/bison-patches/2012-09/msg00039.html
In an effort to factor the handling of errors and warnings, use the
Bison warning routines to report these messages.
* src/conflicts.c (conflicts_print): Rewrite with clearer sections
about S/R and then R/R conflicts.
(conflict_report): Remove, inlined in its sole
caller...
(conflicts_output): here.
* tests/conflicts.at, tests/existing.at, tests/glr-regression.at,
* tests/reduce.at, tests/regression.at: Adjust the expected results.
* NEWS: Update.
States that shift the error token do not have default reductions,
and GLR disables some default reductions, so "all" was a misnomer.
* doc/bison.texinfo (%define Summary): Update.
(Default Reductions): Update.
* src/print.c (print_reductions): Update.
* src/reader.c (prepare_percent_define_front_end_variables):
Update.
* src/tables.c (action_row): Update.
* tests/input.at (%define enum variables): Update.
* tests/reduce.at (%define lr.default-reductions): Update.
(cherry picked from commit d815ec4a62)
States that shift the error token do not have default reductions,
and GLR disables some default reductions, so "all" was a misnomer.
* doc/bison.texinfo (%define Summary): Update.
(Default Reductions): Update.
* src/print.c (print_reductions): Update.
* src/reader.c (prepare_percent_define_front_end_variables):
Update.
* src/tables.c (action_row): Update.
* tests/input.at (%define enum variables): Update.
* tests/reduce.at (%define lr.default-reductions): Update.
For now, just api.push-pull and lr.keep-unreachable-states.
Maintain old names for backward compatibility.
* NEWS (2.5): Document.
* data/c.m4 (b4_identification): Update comment.
* data/yacc.c: Update access.
* doc/bison.texinfo: Update.
* etc/bench.pl.in (bench_push_parser): Update use.
* src/files.c (tr): Move to...
* src/getargs.c, src/getargs.h (tr): ... here because I can't
think of a better place to expose it. My logic is that, for all
uses of tr so far, command-line arguments can be involved, and
getargs.h is already included.
* src/main.c (main): Update access.
* src/muscle_tab.c (muscle_percent_define_insert): Convert old
variable names to new variable names before assigning value.
* src/reader.c (reader): Update setting default.
* tests/calc.at: Update uses.
* tests/conflicts.at (Unreachable States After Conflict
Resolution): Update use.
* tests/input.at (%define enum variables): Update use.
(%define backward compatibility): New test group.
* tests/push.at: Update uses.
* tests/reduce.at: Update uses.
* tests/torture.at: Update uses.
(cherry picked from commit 812775a039)
Conflicts:
data/c.m4
etc/bench.pl.in
src/parse-gram.c
src/parse-gram.h
tests/conflicts.at
For now, just api.push-pull and lr.keep-unreachable-states.
Maintain old names for backward compatibility.
* NEWS (2.5): Document.
* data/c.m4 (b4_identification): Update comment.
* data/yacc.c: Update access.
* doc/bison.texinfo: Update.
* etc/bench.pl.in (bench_grammar): Update use.
* src/files.c (tr): Move to...
* src/getargs.c, src/getargs.h (tr): ... here because I can't
think of a better place to expose it. My logic is that, for all
uses of tr so far, command-line arguments can be involved, and
getargs.h is already included.
* src/main.c (main): Update access.
* src/muscle_tab.c (muscle_percent_define_insert): Convert old
variable names to new variable names before assigning value.
* src/reader.c (reader): Update setting default.
* tests/calc.at: Update uses.
* tests/conflicts.at (Unreachable States After Conflict
Resolution): Update use.
* tests/input.at (%define enum variables): Update use.
(%define backward compatibility): New test group.
* tests/push.at: Update uses.
* tests/reduce.at: Update uses.
* tests/torture.at: Update uses.