* src/lalr.c: Move logs to a better place to understand the chronology
of events.
* src/symlist.c (symbol_list_syms_print): Don't dump core on type
elements.
I never understood why we book ngotos+1 slots for relations between
gotos: there are at most ngotos images, not ngotos+1 (and "includes"
does have cases where a goto is in relation with itself, so it's not
ngotos-1).
Maybe bbf37f2534 explains the +1: a bug
let us register a goto several times on occasion, and the +1 might
have been a means to avoid this problem in most cases. Now that this
bug is addressed, we should no longer overbook memory, if only for the
clarity of the code ("why ngotos+1 instead of ngotos?").
* src/lalr.c: A goto has at most ngotos images, not ngotos+1.
While at it, avoid the useless repeated calls to map_goto introduced in
bbf37f2534.
The "includes" relation [DeRemer 1982] is between gotos, so of course,
for a given goto, there cannot be more than ngotos (number of gotos)
images. But we manipulate the set of images of a goto as a list,
without checking that an image was not already introduced. So we can
"register" way more images than ngotos, leading to a crash (heap
buffer overflow).
Reported by wcventure.
http://lists.gnu.org/archive/html/bug-bison/2019-03/msg00007.html
For the record, this bug is present in the first committed version of
Bison.
* src/lalr.c (build_relations): Don't insert the same goto several
times.
* tests/sets.at (Build Relations): New.
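A minimal sketch of the idea behind this fix, not the code in
src/lalr.c: if an image is appended only when it is not already in the
list, a buffer of ngotos slots can no longer overflow. The names
edge_add_unique, edge, nedges and capacity are hypothetical, and the
sketch assumes every image is a valid goto number in [0, capacity).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type standing in for Bison's goto numbers.  */
typedef int goto_number;

/* Append IMAGE to EDGE[0..*NEDGES) unless it is already present.
   CAPACITY is the number of slots in EDGE (ngotos in the entry above).
   Return true if IMAGE was newly added, false if it was a duplicate.  */
static bool
edge_add_unique (goto_number *edge, size_t *nedges, size_t capacity,
                 goto_number image)
{
  for (size_t i = 0; i < *nedges; ++i)
    if (edge[i] == image)
      return false;
  /* Duplicates are filtered out above, so at most CAPACITY distinct
     images can ever be stored.  */
  assert (*nedges < capacity);
  edge[(*nedges)++] = image;
  return true;
}
```

The linear scan is quadratic in the worst case; it is used here only
because it keeps the sketch short.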
* src/lalr.c (follows_print): Just print the symbol tag.
Take and print a title.
Indent the output.
Use it to print the various steps of the computation.
(lookahead_tokens_print): Fix a lie: the number displayed is not the
number of tokens.
Don't display states that don't even have reductions.
* src/relation.h, src/relation.c (relation_print): Accept and use a
title.
Don't print empty rows.
Indent the output.
Adjust dependencies.
* src/lalr.c (initialize_goto_follows): Print 'reads' in traces.
Commit db34f79889 renamed the variable F
as goto_follows, but forgot to rename this function.
* src/lalr.c (initialize_F): Rename as...
(initialize_goto_follows): this.
I need to be able to read includes and goto_follows.
* src/relation.h, src/relation.c (relation_print): Provide a means to
pretty-print the nodes of the relation.
* src/lalr.c (goto_print, follows_print): New.
(set_goto_map): Use goto_print.
(build_relations): Show INCLUDES.
(compute_FOLLOWS): Rename as...
(compute_follows): this.
Show FOLLOWS.
* src/lalr.c (build_relations): Reduce the scopes.
Instead of keeping rp alive in two different loops, clarify the second
one by having an index on the path we traverse (i.e., use that index
to compute the source state _and_ the symbol that labels the
transition).
This lets us turn an obscure 'while'-loop into a clearer (IMHO)
'for'-loop. We also consume more variables (by introducing p instead
of making more side effects on length), but we're in 2019, I don't
think this matters. What does matter is that (IMHO again), this is
now clearer.
Also, use clearer names.
* doc/bison.texinfo: Space change.
* src/system.h (STREQ, STRNEQ): New.
* src/files.c, src/ielr.c, src/lalr.c, src/muscle-tab.c,
* src/output.c, src/print.c, src/print_graph.c,
* src/reader.c, src/scan-skel.l, src/tables.c,
* src/uniqstr.c:
Use them.
* src/scan-gram.l: Do not use streq.h, use system.h's STREQ.
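For illustration, macros of this kind are typically thin wrappers
around strcmp. The definitions below are an assumption, since this
entry only states that STREQ and STRNEQ are new; is_known_lr_type is
a hypothetical helper added to show a call site.

```c
#include <string.h>

/* Assumed definitions; see src/system.h for the real ones.  */
#define STREQ(L, R)  (strcmp (L, R) == 0)
#define STRNEQ(L, R) (!STREQ (L, R))

/* Hypothetical helper: return nonzero if S is one of the three
   lr.type values named elsewhere in this ChangeLog.  */
static int
is_known_lr_type (char const *s)
{
  return (STREQ (s, "LALR")
          || STREQ (s, "IELR")
          || STREQ (s, "canonical LR"));
}
```

Compared with a bare strcmp, the macro form avoids the easy mistake
of reading a zero result as "different".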
* cfg.mk: The documentation is an exception.
This change was made by applying emacs' untabify function to
nearly all files in Bison's repository. Required tabs in make
files, ChangeLog, regexps, and test code were manually skipped.
Other notable exceptions and changes are listed below.
* bootstrap: Skip because we sync this with gnulib.
* data/m4sugar/foreach.m4
* data/m4sugar/m4sugar.m4: Skip because we sync these with
Autoconf.
* djgpp: Skip because I don't know how to test djgpp properly, and
this code appears to be unmaintained anyway.
* README-hacking (Hacking): Specify that tabs should be avoided
where not required.
Stop equating LR(0) with nondeterminism and LALR(1) with
determinism. That is, if all states are consistent, then LR(0)
tables are deterministic. On the other hand, LALR(1) tables
might be nondeterministic before conflict resolution, and GLR
permits LALR(1) tables to remain nondeterministic.
* src/LR0.c, src/LR0.h: Here.
* src/lalr.c, src/lalr.h: Here.
* src/main.c (main): Here.
* src/state.c, src/state.h: Here.
* src/ielr.h (ielr): In preconditions, expect LR(0) not LALR(1)
parser tables.
(cherry picked from commit 1c4ad777cb)
Its value can be "LALR", "IELR", or "canonical LR".
* lib/timevar.def (TV_IELR_PHASE1): New var.
(TV_IELR_PHASE2): New var.
(TV_IELR_PHASE3): New var.
(TV_IELR_PHASE4): New var.
* src/local.mk (src_bison_SOURCES): Add AnnotationList.c,
AnnotationList.h, InadequacyList.c, InadequacyList.h, Sbitset.c,
Sbitset.h, ielr.c, and ielr.h.
* src/getargs.h, src/getargs.c (enum trace, trace_args,
trace_types): Add trace_ielr.
* src/lalr.h, src/lalr.c (ngotos): Export it.
(F): Rename to...
(goto_follows): ... this, update all uses, and export it.
(set_goto_map): Export it.
(map_goto): Export it.
(compute_lookahead_tokens): Don't free goto_follows yet. Now
handled in ielr.
(initialize_LA): Export it. Move lookback allocation to...
(lalr): ... here because, for canonical LR, initialize_LA must
be invoked but lookback and much of the rest of LALR isn't
needed.
* main.c (main): Instead of lalr, invoke ielr, which invokes
lalr.
* src/reader.c (reader): Default lr.type to "LALR".
Default lr.default_rules to "accepting" if lr.type is "canonical
LR". Leave the default as "all" otherwise.
Check for a valid lr.type value.
* src/state.h, src/state.c (struct state_list): Add state_list
member.
(state_new): Initialize state_list member to NULL.
(state_new_isocore): New function, exported.
* tests/existing.at (AT_TEST_EXISTING_GRAMMAR): New macro that
exercises all values of lr.type.
(GNU AWK Grammar): Rename test group to...
(GNU AWK 3.1.0 Grammar): ... this, and extend to use
AT_TEST_EXISTING_GRAMMAR.
(GNU Cim Grammar): Extend to use AT_TEST_EXISTING_GRAMMAR.
(GNU pic Grammar): Rename test group to...
(GNU pic (Groff 1.18.1) Grammar): ... this, and extend to use
AT_TEST_EXISTING_GRAMMAR.
* tests/reduce.at (AT_TEST_LR_TYPE): New macro that exercises
all values of lr.type.
(Single State Split): New test groups using AT_TEST_LR_TYPE.
(Lane Split): Likewise.
(Complex Lane Split): Likewise.
(Split During Added Lookahead Propagation): Likewise.
Its value describes the states that are permitted to contain
default rules: "all", "consistent", or "accepting".
* src/reader.c (reader): Default lr.default_rules to "all".
Check for a valid lr.default_rules value.
* src/lalr.c (state_lookahead_tokens_count): If lr.default_rules
is "accepting", then only mark the accepting state as
consistent.
(initialize_LA): Tell state_lookahead_tokens_count whether
lr.default_rules is "accepting".
* src/tables.c (action_row): If lr.default_rules is not "all",
then disable default rules in inconsistent states.
* src/print.c (print_reductions): Use this opportunity to
perform some assertions about whether lr.default_rules was
obeyed correctly.
* tests/local.at (AT_TEST_TABLES_AND_PARSE): New macro that
helps with checking the parser tables for a grammar.
* tests/input.at (%define lr.default_rules invalid values): New
test group.
* tests/reduce.at (AT_TEST_LR_DEFAULT_RULES): New macro using
AT_TEST_TABLES_AND_PARSE.
(`no %define lr.default_rules'): New test group generated by
AT_TEST_LR_DEFAULT_RULES.
(`%define lr.default_rules "all"'): Likewise.
(`%define lr.default_rules "consistent"'): Likewise.
(`%define lr.default_rules "accepting"'): Likewise.
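The three values can be read as a policy deciding, per state, whether
a default rule is permitted. The following is a rough sketch of that
reading only; the enum, parse_default_rules and default_rule_permitted
are hypothetical names and do not reflect how src/lalr.c or
src/tables.c are organized.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical encoding of the lr.default_rules values.  */
enum default_rules
{
  DR_ALL,
  DR_CONSISTENT,
  DR_ACCEPTING,
  DR_INVALID
};

/* Map the %define value to the enum; DR_INVALID for anything else.  */
static enum default_rules
parse_default_rules (char const *value)
{
  if (strcmp (value, "all") == 0)
    return DR_ALL;
  if (strcmp (value, "consistent") == 0)
    return DR_CONSISTENT;
  if (strcmp (value, "accepting") == 0)
    return DR_ACCEPTING;
  return DR_INVALID;
}

/* Return true if a state may carry a default rule under POLICY.
   CONSISTENT and ACCEPTING describe the state being examined.  */
static bool
default_rule_permitted (enum default_rules policy,
                        bool consistent, bool accepting)
{
  switch (policy)
    {
    case DR_ALL:
      return true;
    case DR_CONSISTENT:
      return consistent;
    case DR_ACCEPTING:
      return accepting;
    default:
      return false;
    }
}
```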
move the check for disabled transitions to an aver since conflict
resolution hasn't happened yet.
* src/lalr.c (state_lookahead_tokens_count): Remove the check that
labels a state as inconsistent just because it has error transitions.
The original form of this check appeared in revision 1.1 of lalr.c,
which was committed on 1991-12-21. Now (at least), changing the
consistency label on such a state appears to have no useful effect in
any of the places it is examined, which I enumerate below. The key
point to understanding each item in this enumeration is that a state
with an error transition is labelled consistent in the first place only
if it has no rules, so the check cannot matter for states that have
rules. (1) Labelling a state as inconsistent will cause set_conflicts
to try to identify its conflicts, and a state must have *rules* to have
conflicts. (2) Labelling a state as inconsistent will affect how
action_row sets the default *rule* for the state. (3) Labelling a
state as inconsistent will cause build_relations to add lookback edges
to *rules* in that state.
* src/state.h (struct state): Word the comment for member consistent
more carefully.
* src/conflicts.c (conflicts_update_state_numbers): Fix for-loop.
* src/lalr.c (lalr_update_state_numbers): Fix for-loop.
* src/reader.c (check_and_convert_grammar): Fix for-loop.
* src/state.c (state_mark_reachable_states): Fix for-loop.
(state_remove_unreachable_states): Fix for-loop.
Don't widen struct state with member reachable just to temporarily
record reachability. Instead, use a local bitset.
* src/state.h (struct state): Remove member.
* src/state.c (state_new): Don't initialize it.
(state_mark_reachable_states): Rename to...
(state_record_reachable_states): ... this, and use bitset.
(state_remove_unreachable_states): Use bitset.
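A simplified sketch of the approach, under stated assumptions:
reachability is recorded in a local array that lives only for the
duration of the traversal, so the state structure needs no extra
member. A plain bool array stands in for the bitset, and struct
automaton, record_reachable and count_unreachable are hypothetical
names, not Bison's data structures.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical automaton: state S has NTRANS[S] transitions whose
   target state numbers are TRANS[S][0..NTRANS[S]).  */
struct automaton
{
  size_t nstates;
  size_t const *ntrans;
  size_t const *const *trans;
};

/* Mark S and every state reachable from it in REACHABLE.  */
static void
record_reachable (struct automaton const *a, size_t s, bool *reachable)
{
  if (reachable[s])
    return;
  reachable[s] = true;
  for (size_t i = 0; i < a->ntrans[s]; ++i)
    record_reachable (a, a->trans[s][i], reachable);
}

/* Return the number of states not reachable from state 0.  The
   reachability record is local and freed before returning.  */
static size_t
count_unreachable (struct automaton const *a)
{
  if (a->nstates == 0)
    return 0;
  bool *reachable = calloc (a->nstates, sizeof *reachable);
  if (!reachable)
    abort ();
  record_reachable (a, 0, reachable);
  size_t res = 0;
  for (size_t s = 0; s < a->nstates; ++s)
    if (!reachable[s])
      ++res;
  free (reachable);
  return res;
}
```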
report rules that are then unused, and don't report conflicts in those
states.
* src/conflicts.c, src/conflicts.h (conflicts_update_state_numbers):
New global function.
* src/lalr.c, src/lalr.h (lalr_update_state_numbers): New global
function.
* src/main.c (main): After conflict resolution, remove the unreachable
states and update all data structures that reference states by number.
* src/state.c (state_new): Initialize each state's reachable member to
false.
(state_mark_reachable_states): New static function.
(state_remove_unreachable_states): New global function.
* src/state.h (struct state): Add member bool reachable.
(state_remove_unreachable_states): Prototype.
* tests/conflicts.at (Unreachable States After Conflict Resolution):
New test case.
* tests/existing.at (GNU pic Grammar): Update test case output now that
an unused rule is discovered.