Commit Graph

2426 Commits

Author SHA1 Message Date
Akim Demaille
d04962f788 style: eliminate useless indirection
* src/relation.h, src/relation.c (relation_digraph): Don't take the
bitsetv as a pointer, it is already a pointer (as it's an array).
2019-02-25 06:19:55 +01:00
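The indirection removed in d04962f788 can be illustrated with a minimal sketch. The types `row` and `rowv` and both functions are hypothetical stand-ins, not Bison's or gnulib's actual declarations; the only assumption carried over from the message is that the vector type is itself already a pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins: a "vector" type that is already a pointer. */
typedef unsigned row;
typedef row *rowv;

/* Before: a needless extra level.  Callers write &v, the callee (*v)[i]. */
static unsigned
sum_indirect (rowv *v, size_t n)
{
  unsigned res = 0;
  for (size_t i = 0; i < n; ++i)
    res += (*v)[i];
  return res;
}

/* After: pass the pointer itself.  The callee can still modify the rows. */
static unsigned
sum_direct (rowv v, size_t n)
{
  unsigned res = 0;
  for (size_t i = 0; i < n; ++i)
    res += v[i];
  return res;
}
```

Both versions read the same elements; the second only drops one dereference and one `&` at each call site.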
Akim Demaille
ec8142391a style: rename function for clarity
Commit db34f79889 renamed the variable F
as goto_follows, but forgot to rename this function.

* src/lalr.c (initialize_F): Rename as...
(initialize_goto_follows): this.
2019-02-25 06:19:55 +01:00
Akim Demaille
59bec5fade lalr: more debug traces
I need to be able to read includes and goto_follows.

* src/relation.h, src/relation.c (relation_print): Provide a means to
pretty-print the nodes of the relation.
* src/lalr.c (goto_print, follows_print): New.
(set_goto_map): Use goto_print.
(build_relations): Show INCLUDES.
(compute_FOLLOWS): Rename as...
(compute_follows): this.
Show FOLLOWS.
2019-02-25 06:19:54 +01:00
Akim Demaille
5230e610fc style: minor changes
* examples/c/calc/calc.y, src/lalr.c: Reduce scope.
* src/gram.c: Prefer < to >.
2019-02-24 19:08:01 +01:00
Akim Demaille
b81419a9fd style: clarify the computation of the lookback edges
* src/lalr.c (build_relations): Reduce the scopes.
Instead of keeping rp alive in two different loops, clarify the second
one by having an index on the path we traverse (i.e., use that index
to compute the source state _and_ the symbol that labels the
transition).
This allows turning an obscure 'while'-loop into a clearer (IMHO)
'for'-loop.  We also consume more variables (by introducing p instead
of making more side effects on length), but we're in 2019, I don't
think this matters.  What does matter is that (IMHO again), this is
now clearer.
Also, use clearer names.
2019-02-24 19:07:32 +01:00
Akim Demaille
2b9ee006d8 style: scope reduction in tables.c
* src/tables.c: here.
* src/lalr.c: Prefer < to >.
2019-02-24 12:00:44 +01:00
Akim Demaille
bd55d43333 graph: prefer *.gv to *.dot
Reported by Hans Åberg.
https://lists.gnu.org/archive/html/help-bison/2019-02/msg00064.html

* src/files.c (spec_graph_file): Use `*.gv` when 3.4 or better,
otherwise `*.dot`.
* src/parse-gram.y (handle_require): Pretend we are already 3.4.
* doc/bison.texi: Adjust.
* tests/local.at, tests/output.at: Exercise this.
2019-02-21 06:46:07 +01:00
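A minimal sketch of the extension choice described in bd55d43333. The function name and the version encoding (major * 100 + minor, so 3.4 becomes 304) are assumptions made for illustration; the message only states that `*.gv` is used when the required version is 3.4 or better, and `*.dot` otherwise.

```c
/* Hypothetical helper.  REQUIRED_VERSION is assumed to be encoded as
   major * 100 + minor, e.g. 304 for a grammar that requires 3.4.  */
static const char *
graph_extension (int required_version)
{
  return 304 <= required_version ? ".gv" : ".dot";
}
```

Grammars that require an older version keep getting `*.dot`, so existing build rules are not broken.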
Akim Demaille
d7ec136ffb style: move pkgdatadir to files.*
Let's move it to a more logical place.

* src/output.h, src/output.c (pkgdatadir): Move to...
* src/files.h, src/files.c: here.
2019-02-16 07:26:16 +01:00
Akim Demaille
dbdf2878ab style: rename cleanup_caret as caret_free
* src/location.c, src/location.h, src/main.c: here.
2019-02-14 18:53:01 +01:00
Akim Demaille
8654fca058 style: avoid default in switch on enums
* src/assoc.c (assoc_to_string): here.
2019-02-14 06:27:03 +01:00
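The idea behind 8654fca058 can be sketched as follows. The enum and function below are hypothetical, not the actual contents of src/assoc.c. Without a `default:` label, GCC and Clang (with -Wswitch, which -Wall enables) warn as soon as an enumerator is added but not handled.

```c
#include <stdlib.h>

/* Hypothetical enum, for illustration only. */
typedef enum { kind_undef, kind_right, kind_left } assoc_kind;

static const char *
assoc_kind_to_string (assoc_kind a)
{
  /* No 'default:' here, so the compiler can report a missing case. */
  switch (a)
    {
    case kind_undef: return "undefined associativity";
    case kind_right: return "%right";
    case kind_left:  return "%left";
    }
  /* Only reached if A holds a value outside the enum. */
  abort ();
}
```

The trailing `abort ()` keeps the function well defined for invalid values without hiding missing cases from the compiler.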
Akim Demaille
fb83319d9c style: comment and names changes in map_goto
* src/lalr.h, src/lalr.c: Use clearer names.
2019-02-12 06:19:10 +01:00
Akim Demaille
ad7d8af6d1 style: factor printing of rules
* src/gram.h, src/gram.c (rule_print): New.
Use it.
2019-02-09 08:59:55 +01:00
Akim Demaille
f293345aa8 style: use lower case for variable names
* src/relation.c (INDEX, VERTICES): Rename as...
(indexes, vertices): these.
2019-02-09 08:58:12 +01:00
Akim Demaille
e18ad5a96b style: scope reduction in relation.c
2019-02-09 08:58:12 +01:00
Akim Demaille
dd232b95b7 report: stop counting uselessly
* src/print.c (print_nonterminal_symbols): Replace left_count and
right_count with on_left and on_right.
2019-02-09 08:23:50 +01:00
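A minimal sketch of the change in dd232b95b7: when the report only needs to know whether a symbol occurs on a given side, two Booleans are enough. The `side_rule` type and `symbol_sides` function are hypothetical and much simpler than Bison's own rule representation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified rule: symbols are plain integers. */
typedef struct
{
  int lhs;
  const int *rhs;
  size_t rhs_len;
} side_rule;

/* Record whether SYM appears on the left and/or right of any rule. */
static void
symbol_sides (const side_rule *rules, size_t nrules, int sym,
              bool *on_left, bool *on_right)
{
  *on_left = false;
  *on_right = false;
  for (size_t r = 0; r < nrules; ++r)
    {
      if (rules[r].lhs == sym)
        *on_left = true;
      for (size_t i = 0; i < rules[r].rhs_len; ++i)
        if (rules[r].rhs[i] == sym)
          *on_right = true;
    }
}
```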
Akim Demaille
51861998c7 report: clean up its format
The format is inconsistent.  For instance most sections are
indented (including "Terminals unused in grammar" for instance), but
the sections "Terminals, with rules where they appear" and
"Nonterminals, with rules where they appear" are not.  Let's indent
them.  Also, these two sections try to wrap the output to avoid lines
too long.  Yet we don't do that in the rest of the file, for instance
when listing the lookaheads of an item.

For instance in the case of Bison's parse-gram.output we go from:

    Terminals, with rules where they appear

    "end of file" (0) 0
    error (256) 28 88
    "string" <char*> (258) 9 13 16 17 20 23 24 109 116
    [...]

    Nonterminals, with rules where they appear

    $accept (58)
        on left: 0
    input (59)
        on left: 1, on right: 0
    prologue_declarations (60)
        on left: 2 3, on right: 1 3
    prologue_declaration (61)
        on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24
        25 26 27 28 29, on right: 3
    [...]

to

    Terminals, with rules where they appear

        "end of file" (0) 0
        error (256) 28 88
        "string" <char*> (258) 9 13 16 17 20 23 24 109 116
        [...]

    Nonterminals, with rules where they appear

        $accept (58)
            on left: 0
        input (59)
            on left: 1
            on right: 0
        prologue_declarations (60)
            on left: 2 3
            on right: 1 3
        prologue_declaration (61)
            on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24 25 26 27 28 29
            on right: 3
    [...]

* src/print.c (END_TEST): Remove.
(print_terminal_symbols): Don't try to wrap the output.
(print_nonterminal_symbols): Likewise.
Make two different lines for occurrences on the left, and occurrences
on the rhs of the rules.
Indent by 4 and 8, not 3.
* src/reduce.c (reduce_output): Indent by 4, not 3.

* tests/conflicts.at, tests/existing.at, tests/reduce.at,
* tests/regression.at, tests/report.at:
Adjust.
2019-02-09 08:23:50 +01:00
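The new nonterminal layout from 51861998c7 (name indented by 4, "on left:" and "on right:" on separate lines indented by 8, no wrapping) can be sketched like this. `format_nterm` is a hypothetical helper that writes into a caller-provided buffer; it is not the code in src/print.c, and the rule lists are passed as ready-made strings to keep the sketch short.

```c
#include <stdio.h>

/* Hypothetical formatter.  RIGHT may be NULL when the symbol never
   appears on a right-hand side.  Returns the snprintf-style length.  */
static int
format_nterm (char *buf, size_t size, const char *name, int number,
              const char *left, const char *right)
{
  int n = snprintf (buf, size, "    %s (%d)\n        on left: %s\n",
                    name, number, left);
  if (right && 0 <= n && (size_t) n < size)
    n += snprintf (buf + n, size - n, "        on right: %s\n", right);
  return n;
}
```

Calling it with `"input", 59, "1", "0"` yields the three lines shown for `input (59)` in the "to" excerpt of the message.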
Akim Demaille
e346210c03 add LR(0) output
This should not be used to generate parsers.  My point is actually to
facilitate debugging (when tweaking the generation of the LR(0)
automaton for instance, not caring -yet- about lookaheads).

* src/reader.c (prepare_percent_define_front_end_variables): Add lr(0).
* src/conflicts.c (set_conflicts): Be robust to reds not having
lookaheads at all.
* src/ielr.c (LrType, lr_type_get): Adjust.
(ielr): Implement support for LR(0).
* src/lalr.c (lalr_free): Don't free LA when it's not computed.
2019-02-05 19:02:09 +01:00
Akim Demaille
0d44f83fcc style: scope reduction in derives.c
* src/derives.c: here.
2019-02-05 08:45:52 +01:00
Akim Demaille
40b5f89ee0 style: comment changes and refactoring in state.c
* src/state.h, src/state.c: Comment changes.
(transitions_to): Take a state* as argument.
* src/lalr.h, src/lalr.c: Comment changes.
(initialize_F): Use clear variable names.
2019-02-05 08:45:52 +01:00
Akim Demaille
cf96d1b0af Merge branch maint
* maint:
  maint: post-release administrivia
  version 3.3.2
  style: minor fixes
  NEWS: named constructors are preferable to symbol_type ctors
  gram: fix handling of nterms in actions when some are unused
  style: rename local variable
  CI: update the ICC serial number for travis-ci.org
2019-02-03 15:23:54 +01:00
Akim Demaille
334cb8f222 style: minor fixes
* NEWS, src/reduce.c, src/reduce.h: Use 'nonterminal'.
Fix comments.
2019-02-03 14:42:22 +01:00
Akim Demaille
cacdfc2f6e gram: fix handling of nterms in actions when some are unused
Since Bison 3.3, semantic values in rule actions (i.e., '$...') are
passed to the m4 backend as the symbol number.  Unfortunately, when
there are unused symbols, the symbols are renumbered _after_ the
numbers were used in the rule actions.  As a result, the evaluation of
the skeleton failed because it used nonexistent symbol numbers.
Which is the happy scenario: we could use numbers of other existing
symbols...

Reported by Balázs Scheidler.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00044.html

Translating the rule actions after the symbol renumbering moves too
many parts in bison.  Relying on the symbol identifiers is more
troublesome than it might first seem: some don't have an
identifier (tokens with only a literal string), some might have a
complex one (tokens with a literal string with characters special for
M4).  Well, these are tokens, but nterms also have issues: "dummy"
nterms (for midrule actions) are named $@32 etc. which is risky for
M4.

Instead, let's simply give M4 the mapping between the old numbers and
the new ones.  To avoid confusion between old and new numbers, always
emit pre-renumbering numbers as "orig NUM".

* data/README: Give details about "orig NUM".
* data/skeletons/bison.m4 (__b4_symbol, _b4_symbol): Resolve the
"orig NUM".
* src/output.c (prepare_symbol_definitions): Pass nterm_map to m4.
* src/reduce.h, src/reduce.c (nterm_map): Extract it from
nonterminals_reduce, to make it public.
(reduce_free): Free it.
* src/scan-code.l (handle_action_dollar): When referring to a nterm,
use "orig NUM".
* tests/reduce.at (Useless Parts): New, based on Balázs Scheidler's
report.
2019-02-03 10:05:53 +01:00
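The old-to-new mapping described in cacdfc2f6e can be sketched as follows. Both functions are hypothetical, and the numbering policy (useful nonterminals first, in their original order, then the unused ones) is an assumption made for the sake of the example. The point is only that a pre-renumbering "orig NUM" can be resolved later through a table instead of translating the actions again.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fill MAP so that MAP[old] is the new number of nonterminal OLD.
   Assumed policy: useful ones first, then the unused ones.  */
static void
renumber (const bool *useful, size_t n, int *map)
{
  int next = 0;
  for (size_t i = 0; i < n; ++i)
    if (useful[i])
      map[i] = next++;
  for (size_t i = 0; i < n; ++i)
    if (!useful[i])
      map[i] = next++;
}

/* Resolve a pre-renumbering number ("orig NUM") to its final number. */
static int
resolve_orig (const int *map, int orig_num)
{
  return map[orig_num];
}
```

With three nonterminals of which the middle one is unused, the table is {0, 2, 1}: an action that recorded "orig 2" ends up referring to symbol 1.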
Akim Demaille
48429252c1 style: reduce scopes
* src/symlist.c (symbol_list_free): New.
2019-02-03 07:28:57 +01:00
Akim Demaille
d459a5b8e6 style: prefer snprintf to sprintf
* src/symtab.c (dummy_symbol_get): There's no need for the buffer to
be so big and static.
Use snprintf for safety.
2019-02-03 07:28:57 +01:00
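A minimal sketch of the snprintf idiom from d459a5b8e6. The helper name is hypothetical; the "$@NUM" shape of the names comes from the description of dummy nonterminals in cacdfc2f6e. Unlike sprintf, snprintf never writes more than `size` bytes, and its return value tells the caller whether the result was truncated.

```c
#include <stdio.h>

/* Hypothetical helper: write a name such as "$@32" into BUF.
   Returns the length the full name needs, as snprintf does.  */
static int
dummy_name (char *buf, size_t size, int count)
{
  return snprintf (buf, size, "$@%d", count);
}
```

A small automatic buffer at the call site is then sufficient, for example `char buf[32]; dummy_name (buf, sizeof buf, 32);`.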
Akim Demaille
9566232422 style: comment and name changes
* src/output.c (prepare_symbol_names): here.
* src/reader.c: Remove obsolete comment.
* src/scan-code.l: Use || for Boolean or.
2019-02-02 17:32:10 +01:00
Akim Demaille
dc654a925c style: comment changes
* src/reader.c, src/scan-code.l: here.
2019-02-02 17:32:04 +01:00
Akim Demaille
31788ed4c7 style: rename local variable
* src/reduce.c (nonterminals_reduce): Rename nontermmap as nterm_map.
We will expose it.
2019-02-02 16:37:25 +01:00
Akim Demaille
781d2b02de gram: detect and report (in debug traces) useless chain rules
A rule is a useless chain iff it's a chain (aka unit, or injection)
rule (i.e., the RHS has length 1), and it's useless (it has no
user-defined semantic action).

* src/gram.h, src/gram.c (rule_useless_chain_p): New.
(grammar_dump): Report useless chain rules.
* tests/sets.at: Check the traces.
2019-01-30 07:08:09 +01:00
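The definition given in 781d2b02de translates almost directly into a predicate. The `chain_rule` type and its fields are hypothetical; the real `rule_useless_chain_p` works on Bison's own rule structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified rule. */
typedef struct
{
  size_t rhs_len;        /* Number of symbols on the right-hand side.  */
  bool has_user_action;  /* Whether the user wrote a semantic action.  */
} chain_rule;

/* A useless chain rule: one RHS symbol and no user-defined action. */
static bool
useless_chain_p (const chain_rule *r)
{
  return r->rhs_len == 1 && !r->has_user_action;
}
```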
Akim Demaille
8b5fc2143f lr(0): more debug traces
* src/lr0.c (core_print, kernel_print): New.
Use them.
2019-01-30 07:08:09 +01:00
Akim Demaille
5670677cb6 lr(0): remove useless conditional
* src/lr0.c (new_itemsets): There's no harm in setting a Boolean
several times.
2019-01-30 07:08:08 +01:00
Akim Demaille
32b9dcecc7 style: sort includes and avoid assignments
* src/symtab.c: Sort includes.
* src/gram.c (grammar_rules_print_xml): Avoid assignments to define
'usefulness'.
2019-01-30 07:08:00 +01:00
Akim Demaille
ac12b725ea style: use item_rule
* src/print-graph.c, src/print-xml.c: here.
2019-01-30 07:06:48 +01:00
Akim Demaille
e1783bc686 gram: factor the printing of items and the computation of their rule
There are several places where we need to recover the rule from an
item, let's factor that into item_rule.  We also want to print items
in a nice way: we do it when generating the *output file, but it is
also useful in debug messages.

* src/gram.h, src/gram.c (item_rule, item_print): New.
* src/print.c (print_core): Use them.
* src/state.h, src/state.c: Propagate constness.
2019-01-30 07:06:48 +01:00
Akim Demaille
c4f143eb96 style: scope reduction in print-xml
* src/print-xml.c: here.
2019-01-30 07:06:48 +01:00
Akim Demaille
3075d96d44 style: comment changes
* src/lr0.c, src/state.c, src/state.h: here.
2019-01-28 07:00:23 +01:00
Akim Demaille
7fec997ecf closure: initialize it once for all
The memory allocated by 'closure' (and some data such as 'fderives')
is used to compute a state's full itemset from its core.  This is
needed during the construction of the LR(0) automaton, and the memory
is reclaimed immediately afterwards.

Unfortunately the reports (graph, text, xml) also need this
information when describing the states with their full itemsets.  As a
consequence the memory was allocated again, fderives computed again
too, and more --trace reports are generated which only duplicate what
was already reported.

Stop that.  It does mean that we release the memory later (hence the
peak memory usage is higher now), but I don't think that's a problem
today.

* src/lr0.c (generate_states): Don't call closure_free.
* src/state.c (states_free): Do it here.
(for symmetry with closure_new which is called in generate_states).
* src/print-graph.c, src/print-xml.c, src/print.c: You can now expect
the closure module to be functional.
2019-01-28 06:57:31 +01:00
Akim Demaille
7355a35e4b style: rename *_closure functions as closure_*
This is more consistent with the other files.

* closure.h, closure.c (new_closure, free_closure): Rename as...
(closure_new, closure_free): this.
Adjust dependencies.
2019-01-28 06:47:33 +01:00
Akim Demaille
e585377e68 lr0: use a bitset for the set of "shiftable symbols"
This will make it easier to add new elements (that might already be
part of shift_symbol) without having to worry about the size of
shift_symbol (which is currently a fixed size vector).

I could not measure any significant difference in performance in the
generation of the LR(0) automaton (benched on grammars of Ruby, C, and C++).

* src/lr0.c (shift_symbol): Make it a bitset.
2019-01-28 06:47:33 +01:00
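Why a bitset helps in e585377e68 can be shown with a tiny hand-rolled one. This is a sketch only: Bison uses gnulib's bitset module, whose API is not reproduced here, and the `symset_*` names are hypothetical. Adding a symbol that is already present is harmless, and the storage is sized by the number of symbols rather than by the number of insertions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

enum { WORD_BITS = sizeof (unsigned long) * CHAR_BIT };

/* Add SYM to the set.  Idempotent: adding twice changes nothing. */
static void
symset_add (unsigned long *words, size_t sym)
{
  words[sym / WORD_BITS] |= 1UL << (sym % WORD_BITS);
}

/* Whether SYM is in the set. */
static bool
symset_has (const unsigned long *words, size_t sym)
{
  return (words[sym / WORD_BITS] >> (sym % WORD_BITS)) & 1UL;
}
```

The caller is assumed to provide a zero-initialized array with at least one bit per symbol.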
Akim Demaille
9cd7bd4d5f add -fsyntax-only
When debugging Bison itself, this is very handy, especially when
tweaking the frontend badly enough to break the backends. It can also
be used to check a grammar.

* src/getargs.h, src/getargs.c (feature_syntax_only): New.
(feature_args, feature_types): Adjust.
* src/main.c (main): Use it.
2019-01-28 06:47:07 +01:00
Akim Demaille
a108d84f88 style: beware of collisions on status
* src/symtab.h (status): Rename as...
(declaration_status): this, to avoid colliding with status, the
argument of 'usage'.
'status' seems a tad too general to be used only here.
2019-01-27 20:07:08 +01:00
Akim Demaille
d02ca923e2 usage: document -ffixit
* src/getargs.c (usage): Document -ffixit.
Document the aliases of -f.
2019-01-27 18:08:55 +01:00
Akim Demaille
f82f7eb1d8 style: reduce scopes in state.c and ielr.c
2019-01-27 18:08:47 +01:00
Akim Demaille
0d472b29ec Merge branch 'maint'
* maint:
  maint: post-release administrivia
  version 3.3.1
  yacc: issue warnings, not errors, for Bison extensions
  style: formatting changes in NEWS and complain.c
  tests: don't depend on the user's definition of SHELL
2019-01-27 16:44:56 +01:00
Akim Demaille
8b0b295569 yacc: issue warnings, not errors, for Bison extensions
Reported by Kiyoshi Kanazawa.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00029.html

* src/getargs.c (getargs): Let --yacc imply -Wyacc, not -Werror=yacc.
* tests/input.at: Adjust.
* doc/bison.texi (Bison Options): Document.
2019-01-27 15:53:28 +01:00
Akim Demaille
59a108c0a6 style: formatting changes in NEWS and complain.c
2019-01-27 15:51:44 +01:00
Akim Demaille
21a7fa8063 traces: always print the reduced grammar and fix it
* src/gram.c (grammar_dump): Print the effective number first instead
of last.  And fix it (remove the incorrect "+1").
Use t/f for Booleans.
* src/reduce.c: When asked, always print the reduced grammar, even if
there was nothing useless.
* tests/sets.at (Reduced Grammar): Check that.
2019-01-26 16:21:35 +01:00
Akim Demaille
83463dfbee style: rename LR0.* as lr0.*
Let's stick to lower case for file names.

* src/LR0.h, src/LR0.c: Rename as...
* src/lr0.h, src/lr0.c: these.
2019-01-26 16:21:35 +01:00
Akim Demaille
c3c50c0030 style: rename print_graph.* as print-graph.*
These are the only files with _.

* src/print_graph.h, src/print_graph.c: Rename as...
* src/print-graph.h, src/print-graph.c: these.
2019-01-26 16:16:47 +01:00
Akim Demaille
e85ab7ac9b style: various fixes
* src/gram.c: Use consistent variable names.
Prefer prefix unary operators.
(grammar_dump): Use rule_rhs_length instead of duplicating it.
* src/reduce.c: Avoid useless variables.
2019-01-26 16:16:47 +01:00
Akim Demaille
a11463a02f style: comment changes in gram.h
* src/gram.h: Shorten comments.
2019-01-26 16:16:47 +01:00