Commit Graph

6320 Commits

Author SHA1 Message Date
Akim Demaille
c06ec1f132 gnulib: update
2019-02-13 07:26:38 +01:00
Akim Demaille
fb83319d9c style: comment and names changes in map_goto
* src/lalr.h, src/lalr.c: Use clearer names.
2019-02-12 06:19:10 +01:00
Akim Demaille
e42a7a1862 yacc: support parse.assert
While hacking on the computation of the automaton, I had yystate being
equal to -1, and the parser looped.  Let's catch this when
parse.assert is enabled.

* data/skeletons/yacc.c (YY_ASSERT): New.
Use it.
Not using the name YYASSERT, to make it clear that this is private.
glr.c should probably move to YY_ASSERT too.
Also, while at it, report 'Entering state...' even before growing the
stacks.
2019-02-12 06:19:10 +01:00
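Note on e42a7a1862: a minimal sketch of how a private assertion macro can be switched on and off, so that an invalid state number such as -1 is caught instead of letting the parser loop. This is not the definition from data/skeletons/yacc.c; SKETCH_PARSE_ASSERT, SKETCH_ASSERT and checked_state are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical switch standing in for "parse.assert is enabled". */
#ifndef SKETCH_PARSE_ASSERT
# define SKETCH_PARSE_ASSERT 1
#endif

#if SKETCH_PARSE_ASSERT
# define SKETCH_ASSERT(E) assert (E)
#else
/* Keep E type-checked, but never evaluate it.  */
# define SKETCH_ASSERT(E) ((void) (0 && (E)))
#endif

/* Hypothetical helper: validate a state number before it is used
   as an index into the parser tables.  */
static int
checked_state (int state, int nstates)
{
  SKETCH_ASSERT (0 <= state && state < nstates);
  return state;
}
```

With the switch enabled, checked_state (-1, nstates) aborts immediately, which is easier to diagnose than an endless loop.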
Akim Demaille
8cbf3ce22c examples: depend on Bison's sources
* examples/c/calc/local.mk, examples/c/lexcalc/local.mk,
* examples/c/mfcalc/local.mk, examples/c/rpcalc/local.mk:
Regenerate the files if dependencies have changed.
2019-02-12 06:19:10 +01:00
Eric S. Raymond
1997093e21 README: point to README-hacking
* README (Build from git): New.
* README-hacking: Describe easier submodule update.
2019-02-12 06:19:10 +01:00
Akim Demaille
f23b879ff5 doc: a single space before closing comments
I don't think this style:

    /* If buffer is full, make it bigger.        */
    if (i == length)
      {
        length *= 2;
        symbuf = (char *) realloc (symbuf, length + 1);
      }
    /* Add this character to the buffer.         */
    symbuf[i++] = c;
    /* Get another character.                    */
    c = getchar ();

is more readable than

    /* If buffer is full, make it bigger. */
    if (i == length)
      {
        length *= 2;
        symbuf = (char *) realloc (symbuf, length + 1);
      }
    /* Add this character to the buffer. */
    symbuf[i++] = c;
    /* Get another character. */
    c = getchar ();

Actually, I think the latter is more readable, and helps with width
issues in the PDF.  As a matter of fact, I would happily move to //
only for single line comments.

* doc/bison.texi: A single space before closing comments.
2019-02-10 17:44:23 +01:00
Akim Demaille
8b2d233283 doc: modernize the examples
* doc/bison.texi: Prefer 'fun' to 'fnct'.
Reduce local variable scopes.
Prefer strdup to malloc + strcpy.
Avoid gratuitous casts.
Use simpler names (e.g., 'name' instead of 'fname').
Avoid uses of 0 for NULL.
Avoid using NULL when possible (e.g., 'p' instead of 'p != NULL').
Prefer union names to casts (e.g. 'yylval.VAR = s' instead of
'*((symrec**) &yylval) = s').
Give arguments a name in fun declarations.
Use our typedefs instead of duplicating them (func_t).
Stop promoting an explicit $$ = $1;, it should be implicit (Bison
might be able to eliminate useless chain rules).
Help Texinfo a bit by making smaller groups.
Rely on the C compiler to call function pointers (prefer
'$1->value.fun ($3)' to '(*($1->value.fnctptr))($3)').
2019-02-10 17:44:23 +01:00
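Note on 8b2d233283: a small sketch of several of the listed guidelines (strdup instead of malloc + strcpy, no gratuitous casts, NULL rather than 0, 'p' rather than 'p != NULL', reduced variable scopes). The symrec layout and the putsym/getsym signatures below are simplified assumptions, not the code from doc/bison.texi.

```c
#define _POSIX_C_SOURCE 200809L /* For strdup.  */
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record, loosely modeled on the manual's symrec.  */
typedef struct symrec
{
  char *name;
  struct symrec *next;
} symrec;

/* Prepend a symbol named NAME to *TABLE.
   Return it, or NULL on allocation failure.  */
static symrec *
putsym (symrec **table, char const *name)
{
  symrec *res = malloc (sizeof *res);   /* No cast needed in C.  */
  if (!res)
    return NULL;
  res->name = strdup (name);            /* Instead of malloc + strcpy.  */
  if (!res->name)
    {
      free (res);
      return NULL;
    }
  res->next = *table;
  *table = res;
  return res;
}

/* Return the symbol named NAME in TABLE, or NULL if there is none.  */
static symrec *
getsym (symrec *table, char const *name)
{
  for (symrec *p = table; p; p = p->next) /* 'p', not 'p != NULL'.  */
    if (strcmp (p->name, name) == 0)
      return p;
  return NULL;
}
```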
Akim Demaille
40fc688765 examples: add a simple infix calculator in C
Currently we have no simple example: rpcalc in reverse Polish, mfcalc
has functions, and lexcalc is using lex.

* examples/c/calc/Makefile, examples/c/calc/calc.y,
* examples/c/calc/calc.test, examples/c/calc/local.mk: New.
2019-02-10 17:44:23 +01:00
Akim Demaille
30f61b0549 examples: fix annoying off-by-one errors
* examples/extexi: Since we issue #lines only at the beginning of
@example, leave empty line when removing content (such as @comment
lines), otherwise the lines that follow have incorrect source line
location.  This leaves ugly empty lines, but they are removed when you
tidy the output for the end user: sequences of \n are mapped to at
most two successive \n.
2019-02-10 16:41:50 +01:00
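Note on 30f61b0549: the tidying step mentioned above (sequences of \n mapped to at most two successive \n) can be sketched as follows. examples/extexi itself is not shown in this log, so this C function, squeeze_newlines, is only a hypothetical illustration of the mapping, not the script's code.

```c
#include <stddef.h>

/* Copy SRC to DST, keeping at most two successive '\n'.
   DST must have room for strlen (SRC) + 1 bytes.
   Return the length of the result.  */
static size_t
squeeze_newlines (char *dst, char const *src)
{
  size_t len = 0;
  int run = 0;                  /* Length of the current run of '\n'.  */
  for (; *src; ++src)
    {
      run = *src == '\n' ? run + 1 : 0;
      if (run <= 2)
        dst[len++] = *src;
    }
  dst[len] = '\0';
  return len;
}
```

Two successive \n leave a single empty line in the output, so the blank lines kept to preserve source locations collapse into one when the example is tidied.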
Akim Demaille
ad7d8af6d1 style: factor printing of rules
* src/gram.h, src/gram.c (rule_print): New.
Use it.
2019-02-09 08:59:55 +01:00
Akim Demaille
f293345aa8 style: use lower case for variable names
* src/relation.c (INDEX, VERTICES): Rename as...
(indexes, vertices): these.
2019-02-09 08:58:12 +01:00
Akim Demaille
e18ad5a96b style: scope reduction in relation.c
2019-02-09 08:58:12 +01:00
Akim Demaille
dd232b95b7 report: stop counting uselessly
* src/print.c (print_nonterminal_symbols): Replace left_count and
right_count with on_left and on_right.
2019-02-09 08:23:50 +01:00
Akim Demaille
51861998c7 report: clean up its format
The format is inconsistent.  For instance most sections are
indented (including "Terminals unused in grammar" for instance), but
the sections "Terminals, with rules where they appear" and
"Nonterminals, with rules where they appear" are not.  Let's indent
them.  Also, these two sections try to wrap the output to avoid overly
long lines.  Yet we don't do that in the rest of the file, for instance
when listing the lookaheads of an item.

For instance in the case of Bison's parse-gram.output we go from:

    Terminals, with rules where they appear

    "end of file" (0) 0
    error (256) 28 88
    "string" <char*> (258) 9 13 16 17 20 23 24 109 116
    [...]

    Nonterminals, with rules where they appear

    $accept (58)
        on left: 0
    input (59)
        on left: 1, on right: 0
    prologue_declarations (60)
        on left: 2 3, on right: 1 3
    prologue_declaration (61)
        on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24
        25 26 27 28 29, on right: 3
    [...]

to

    Terminals, with rules where they appear

    "end of file" (0) 0
    error (256) 28 88
    "string" <char*> (258) 9 13 16 17 20 23 24 109 116
    [...]

    Nonterminals, with rules where they appear

        $accept (58)
            on left: 0
        input (59)
            on left: 1
            on right: 0
        prologue_declarations (60)
            on left: 2 3
            on right: 1 3
        prologue_declaration (61)
            on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24 25 26 27 28 29
            on right: 3
    [...]

* src/print.c (END_TEST): Remove.
(print_terminal_symbols): Don't try to wrap the output.
(print_nonterminal_symbols): Likewise.
Make two different lines for occurrences on the left, and occurrences
on the rhs of the rules.
Indent by 4 and 8, not 3.
* src/reduce.c (reduce_output): Indent by 4, not 3.

* tests/conflicts.at, tests/existing.at, tests/reduce.at,
* tests/regression.at, tests/report.at:
Adjust.
2019-02-09 08:23:50 +01:00
Akim Demaille
e346210c03 add LR(0) output
This should not be used to generate parsers.  My point is actually to
facilitate debugging (when tweaking the generation of the LR(0)
automaton for instance, not caring -yet- about lookaheads).

* src/reader.c (prepare_percent_define_front_end_variables): Add lr(0).
* src/conflicts.c (set_conflicts): Be robust to reds not having
lookaheads at all.
* src/ielr.c (LrType, lr_type_get): Adjust.
(ielr): Implement support for LR(0).
* src/lalr.c (lalr_free): Don't free LA when it's not computed.
2019-02-05 19:02:09 +01:00
Akim Demaille
0d44f83fcc style: scope reduction in derives.c
* src/derives.c: here.
2019-02-05 08:45:52 +01:00
Akim Demaille
40b5f89ee0 style: comment changes and refactoring in state.c
* src/state.h, src/state.c: Comment changes.
(transitions_to): Take a state* as argument.
* src/lalr.h, src/lalr.c: Comment changes.
(initialize_F): Use clear variable names.
2019-02-05 08:45:52 +01:00
Akim Demaille
eed9550993 tests: fix typos
* tests/reduce.at: here.
2019-02-05 08:45:52 +01:00
Akim Demaille
cf96d1b0af Merge branch maint
* maint:
  maint: post-release administrivia
  version 3.3.2
  style: minor fixes
  NEWS: named constructors are preferable to symbol_type ctors
  gram: fix handling of nterms in actions when some are unused
  style: rename local variable
  CI: update the ICC serial number for travis-ci.org
2019-02-03 15:23:54 +01:00
Akim Demaille
3d25b52a10 maint: post-release administrivia
* NEWS: Add header line for next release.
* .prev-version: Record previous version.
* cfg.mk (old_NEWS_hash): Auto-update.
2019-02-03 14:56:05 +01:00
Akim Demaille
437f6250c5 version 3.3.2
* NEWS: Record release date.
v3.3.2
2019-02-03 14:42:30 +01:00
Akim Demaille
334cb8f222 style: minor fixes
* NEWS, src/reduce.c, src/reduce.h: Use 'nonterminal'.
Fix comments.
2019-02-03 14:42:22 +01:00
Akim Demaille
03878edf77 NEWS: named constructors are preferable to symbol_type ctors
Reported by Frank Heckenbach.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00043.html
2019-02-03 10:06:25 +01:00
Akim Demaille
cacdfc2f6e gram: fix handling of nterms in actions when some are unused
Since Bison 3.3, semantic values in rule actions (i.e., '$...') are
passed to the m4 backend as the symbol number.  Unfortunately, when
there are unused symbols, the symbols are renumbered _after_ the
numbers were used in the rule actions.  As a result, the evaluation of
the skeleton failed because it used non existing symbol numbers.
Which is the happy scenario: we could use numbers of other existing
symbols...

Reported by Balázs Scheidler.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00044.html

Translating the rule actions after the symbol renumbering moves too
many parts in bison.  Relying on the symbol identifiers is more
troublesome than it might first seem: some don't have an
identifier (tokens with only a literal string), some might have a
complex one (tokens with a literal string with characters special for
M4).  Well, these are tokens, but nterms also have issues: "dummy"
nterms (for midrule actions) are named $@32 etc. which is risky for
M4.

Instead, let's simply give M4 the mapping between the old numbers and
the new ones.  To avoid confusion between old and new numbers, always
emit pre-renumbering numbers as "orig NUM".

* data/README: Give details about "orig NUM".
* data/skeletons/bison.m4 (__b4_symbol, _b4_symbol): Resolve the
"orig NUM".
* src/output.c (prepare_symbol_definitions): Pass nterm_map to m4.
* src/reduce.h, src/reduce.c (nterm_map): Extract it from
nonterminals_reduce, to make it public.
(reduce_free): Free it.
* src/scan-code.l (handle_action_dollar): When referring to a nterm,
use "orig NUM".
* tests/reduce.at (Useless Parts): New, based on Balázs Scheidler's
report.
2019-02-03 10:05:53 +01:00
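Note on cacdfc2f6e: a sketch of the idea of keeping a map from pre-renumbering numbers to final numbers, so that a number recorded early (an "orig NUM") can still be resolved after unused symbols have been moved. The renumber function and its policy (used symbols first, in their original order, then unused ones) are assumptions for illustration; this is not the code of nonterminals_reduce.

```c
#include <stdbool.h>

/* Fill MAP so that MAP[OLD] is the new number of symbol OLD:
   used symbols first, in their original order, then unused ones.
   Return the number of used symbols.  */
static int
renumber (bool const *used, int n, int *map)
{
  int next = 0;
  for (int i = 0; i < n; ++i)
    if (used[i])
      map[i] = next++;
  int nused = next;
  for (int i = 0; i < n; ++i)
    if (!used[i])
      map[i] = next++;
  return nused;
}
```

With used = {true, false, true, true}, an action that recorded "orig 2" resolves to map[2], i.e., 1, instead of wrongly reusing the old number 2.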
Akim Demaille
56c00ed1ea tests: strengthen some of them
* tests/reduce.at: Check that the generated parsers are proper C.
2019-02-03 08:00:23 +01:00
Akim Demaille
513d2f723f package: rename data/README as data/README.md
So that it is properly rendered by online git services.
2019-02-03 07:28:57 +01:00
Akim Demaille
48429252c1 style: reduce scopes
* src/symlist.c (symbol_list_free): New.
2019-02-03 07:28:57 +01:00
Akim Demaille
d459a5b8e6 style: prefer snprintf to sprintf
* src/symtab.c (dummy_symbol_get): There's no need for the buffer to
be so big and static.
Use snprintf for safety.
2019-02-03 07:28:57 +01:00
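Note on d459a5b8e6: a sketch of why snprintf is the safer choice with a small buffer. It never writes past the given size, and its return value tells whether the output was truncated. The format_dummy_name helper and the "$@%d" format are assumptions (the latter inferred from the "$@32" names mentioned in cacdfc2f6e), not the body of dummy_symbol_get.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Format a dummy symbol name such as "$@32" into BUF, of size SIZE.
   Return true iff the whole name fit.  */
static bool
format_dummy_name (char *buf, size_t size, int n)
{
  int len = snprintf (buf, size, "$@%d", n);
  return 0 <= len && (size_t) len < size;
}
```

When the buffer is too small, the result is truncated but still NUL-terminated, whereas sprintf would overflow the buffer.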
Akim Demaille
9566232422 style: comment and name changes
* src/output.c (prepare_symbol_names): here.
* src/reader.c: Remove obsolete comment.
* src/scan-code.l: Use || for Boolean or.
2019-02-02 17:32:10 +01:00
Akim Demaille
dc654a925c style: comment changes
* src/reader.c, src/scan-code.l: here.
2019-02-02 17:32:04 +01:00
Akim Demaille
76366e8e5c make: regenerate the example parsers when bison changes
* Makefile.am (dependencies): Also depend on Bison's sources.
2019-02-02 17:31:58 +01:00
Akim Demaille
31788ed4c7 style: rename local variable
* src/reduce.c (nonterminals_reduce): Rename nontermmap as nterm_map.
We will expose it.
2019-02-02 16:37:25 +01:00
Akim Demaille
781d2b02de gram: detect and report (in debug traces) useless chain rules
A rule is a useless chain iff it's a chain (aka unit, or injection)
rule (i.e., the RHS has length 1), and it's useless (it has no
user-defined semantic action).

* src/gram.h, src/gram.c (rule_useless_chain_p): New.
(grammar_dump): Report useless chain rules.
* tests/sets.at: Check the traces.
2019-01-30 07:08:09 +01:00
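Note on 781d2b02de: the definition above reads directly as a predicate. The rule_sketch type and its fields are hypothetical stand-ins for Bison's rule structure, and rule_useless_chain_sketch is not the body of rule_useless_chain_p.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a grammar rule.  */
typedef struct
{
  int rhs_length;       /* Number of symbols on the right-hand side.  */
  char const *action;   /* User-defined action, or NULL if none.  */
} rule_sketch;

/* Whether R is a chain rule (RHS of length 1) with no user action.  */
static bool
rule_useless_chain_sketch (rule_sketch const *r)
{
  return r->rhs_length == 1 && !r->action;
}
```

For instance 'exp: term;' qualifies, while 'exp: term { count++; }' and 'exp: exp "+" term;' do not.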
Akim Demaille
8b5fc2143f lr(0): more debug traces
* src/lr0.c (core_print, kernel_print): New.
Use them.
2019-01-30 07:08:09 +01:00
Akim Demaille
5670677cb6 lr(0): remove useless conditional
* src/lr0.c (new_itemsets): There's no harm in setting a Boolean
several times.
2019-01-30 07:08:08 +01:00
Akim Demaille
32b9dcecc7 style: sort includes and avoid assignments
* src/symtab.c: Sort includes.
* src/gram.c (grammar_rules_print_xml): Avoid assignments to define
'usefulness'.
2019-01-30 07:08:00 +01:00
Akim Demaille
ac12b725ea style: use item_rule
* src/print-graph.c, src/print-xml.c: here.
2019-01-30 07:06:48 +01:00
Akim Demaille
e1783bc686 gram: factor the printing of items and the computation of their rule
There are several places where we need to recover the rule from an
item, let's factor that into item_rule.  We also want to print items
in a nice way: we do it when generating the *.output file, but it is
also useful in debug messages.

* src/gram.h, src/gram.c (item_rule, item_print): New.
* src/print.c (print_core): Use them.
* src/state.h, src/state.c: Propagate constness.
2019-01-30 07:06:48 +01:00
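Note on e1783bc686: a sketch of recovering the rule from an item. It assumes a layout where the right-hand sides of all rules are stored back to back, each one terminated by a negative marker equal to -1 - RULE_NUMBER; this layout is an assumption made for the example, and item_rule_sketch is not the code of item_rule.

```c
/* Assumed layout: the RHS symbols of each rule (all >= 0), followed
   by a negative marker equal to -1 - RULE_NUMBER.
   Return the number of the rule that ITEM belongs to.  */
static int
item_rule_sketch (int const *item)
{
  while (0 <= *item)
    ++item;
  return -1 - *item;
}
```

With items {3, 4, -1, 5, -2, -3}, the first two items belong to rule 0, the item holding 5 belongs to rule 1, and the last marker stands for an empty rule 2.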
Akim Demaille
c4f143eb96 style: scope reduction in print-xml
* src/print-xml.c: here.
2019-01-30 07:06:48 +01:00
Akim Demaille
94054924a9 tests: check XML and dot reports
* tests/report.at: Here.
2019-01-30 07:06:48 +01:00
Akim Demaille
c639611002 CI: update the ICC serial number for travis-ci.org
On travis-ci.org, there are five concurrent slaves, instead of three
on travis-ci.com.
2019-01-28 19:53:25 +01:00
Akim Demaille
69061fed82 CI: update the ICC serial number for travis-ci.org
On travis-ci.org, there are five concurrent slaves, instead of three
on travis-ci.com.
2019-01-28 08:12:13 +01:00
Akim Demaille
3075d96d44 style: comment changes
* src/lr0.c, src/state.c, src/state.h: here.
2019-01-28 07:00:23 +01:00
Akim Demaille
7fec997ecf closure: initialize it once for all
The memory allocated by 'closure' (and some data such as 'fderives')
is used to compute a state's full itemset from its core.  This is
needed during the construction of the LR(0) automaton, and the memory
is reclaimed immediately afterwards.

Unfortunately the reports (graph, text, xml) also need this
information when describing the states with their full itemsets.  As a
consequence the memory was allocated again, fderives computed again
too, and more --trace reports are generated which only duplicate what
was already reported.

Stop that.  It does mean that we release the memory later (hence the
peak memory usage is higher now), but I don't think that's a problem
today.

* src/lr0.c (generate_states): Don't call closure_free.
* src/state.c (states_free): Do it here.
(for symmetry with closure_new which is called in generate_states).
* src/print-graph.c, src/print-xml.c, src/print.c: You can now expect
the closure module to be functional.
2019-01-28 06:57:31 +01:00
Akim Demaille
7355a35e4b style: rename *_closure functions as closure_*
This is more consistent with the other files.

* closure.h, closure.c (new_closure, free_closure): Rename as...
(closure_new, closure_free): this.
Adjust dependencies.
2019-01-28 06:47:33 +01:00
Akim Demaille
e585377e68 lr0: use a bitset for the set of "shiftable symbols"
This will make it easier to add new elements (that might already be
part of shift_symbol) without having to worry about the size of
shift_symbol (which is currently a fixed size vector).

I could not measure any significant difference in performance in the
generation of the LR(0) automaton (benched on grammars of Ruby, C, and C++).

* src/lr0.c (shift_symbol): Make it a bitset.
2019-01-28 06:47:33 +01:00
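Note on e585377e68: a sketch of why a bitset suits a set of "shiftable symbols". Adding a symbol that is already present is harmless, and the size depends only on the number of symbols, not on how many insertions are made. This hand-rolled symset is an illustration only; it is not the bitset module used in src/lr0.c, and NSYMS is an arbitrary bound chosen for the example.

```c
#include <limits.h>
#include <stdbool.h>

#define NSYMS 128 /* Arbitrary number of symbols for the example.  */
#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)
#define NWORDS ((NSYMS + WORD_BITS - 1) / WORD_BITS)

typedef struct
{
  unsigned long words[NWORDS];
} symset;

/* Add SYM (< NSYMS) to S.  Adding it again changes nothing.  */
static void
symset_set (symset *s, unsigned sym)
{
  s->words[sym / WORD_BITS] |= 1UL << (sym % WORD_BITS);
}

/* Whether SYM (< NSYMS) is in S.  */
static bool
symset_test (symset const *s, unsigned sym)
{
  return (s->words[sym / WORD_BITS] >> (sym % WORD_BITS)) & 1UL;
}

/* Number of symbols in S.  */
static unsigned
symset_count (symset const *s)
{
  unsigned res = 0;
  for (unsigned sym = 0; sym < NSYMS; ++sym)
    res += symset_test (s, sym);
  return res;
}
```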
Akim Demaille
9cd7bd4d5f add -fsyntax-only
When debugging Bison itself, this is very handy, especially when
tweaking the frontend badly enough to break the backends. It can also
be used to check a grammar.

* src/getargs.h, src/getargs.c (feature_syntax_only): New.
(feature_args, feature_types): Adjust.
* src/main.c (main): Use it.
2019-01-28 06:47:07 +01:00
Akim Demaille
a108d84f88 style: beware of collisions on status
* src/symtab.h (status): Rename as...
(declaration_status): this, to avoid colliding with status, the
argument of 'usage'.
'status' seems a tad too general to be used only here.
2019-01-27 20:07:08 +01:00
Akim Demaille
1e83dd2229 gnulib: update
2019-01-27 19:48:09 +01:00
Akim Demaille
d02ca923e2 usage: document -ffixit
* src/getargs.c (usage): Document -ffixit.
Document the aliases of -f.
2019-01-27 18:08:55 +01:00