Commit Graph

2323 Commits

Author SHA1 Message Date
Akim Demaille
d68f05d75c style: s/non-terminal/nonterminal/
I personally prefer 'non terminal', or 'non-terminal', but
'nonterminal' is the common spelling.

* data/glr.c, src/parse-gram.y, src/symtab.c, src/symtab.h,
* tests/input.at, doc/refcard.tex: here.
2018-12-11 06:55:41 +01:00
Akim Demaille
b05aa7be2e style: rename error functions for clarity
* src/symtab.c (symbol_redeclaration, semantic_type_redeclaration)
(user_token_number_redeclaration):
Rename as...
(complain_symbol_redeclared, complain_semantic_type_redeclared)
(complain_user_token_number_redeclared):
this.
2018-12-11 06:55:35 +01:00
Akim Demaille
20b0746793 parser: improve the error message for symbol class redefinition
Currently our error messages include both "symbol redeclared" and
"symbol redefined", and they mean something different.  This is
obscure, let's make this clearer.

I think the idea behind 'definition' vs. 'declaration' is that in the
case of the nonterminals, the actual definition is its set of rules,
so %nterm would be about declaration.  The case of %token is less
clear.

* src/symtab.c (complain_class_redefined): New.
(symbol_class_set): Use it.
Simplify the logic of this function to clearly skip its body when the
preconditions are not met.
* tests/input.at (Symbol class redefinition): New.
2018-12-11 06:53:25 +01:00
Akim Demaille
4cbdcaa572 regen 2018-12-09 13:55:05 +01:00
Akim Demaille
1e6a68858a regen 2018-12-09 12:50:53 +01:00
Akim Demaille
17730b0287 parser: minor refactoring
* src/parse-gram.y (symbol.prec): Reuse int.opt.
2018-12-09 12:50:53 +01:00
Akim Demaille
157f12c483 parser: move checks inside the called functions
Revamping the handling of the symbols in the grammar is much more
delicate than I anticipated.  Let's first move things around for
clarity.

* src/symtab.c (symbol_make_alias): Don't accept to alias
non-terminals.
(symbol_user_token_number_set): Don't accept user token numbers
for non-terminals.
Don't do anything in case of redefinition, instead of trying to
update.  The flow is easier to follow this way.
2018-12-09 12:50:53 +01:00
Akim Demaille
401afe5cc2 parser: fix incorrect condition to raise a syntax error
* src/parse-gram.y (symbol_def): Fix test.
2018-12-06 17:50:54 +01:00
Akim Demaille
156140dfc3 style: scope reduction in ielr.c
* src/ielr.c: here.
2018-12-05 07:12:12 +01:00
Akim Demaille
4176584062 style: scope reduction in lalr.c
* src/lalr.c: here.
2018-12-05 06:49:06 +01:00
Akim Demaille
22b2c286ff d: add experimental support for the D language
* configure.ac (ENABLE_D): New.
* src/getargs.c (valid_languages): Add d.
2018-12-04 20:29:33 +01:00
Akim Demaille
f539a56620 regen 2018-12-03 18:42:00 +01:00
Akim Demaille
c44a782a4e backend: revamp the handling of symbol types
Currently it is the front end that passes the symbol types to the
backend.  For instance:

  %token <ival> NUM
  %type <ival> exp1 exp2
  exp1: NUM { $$ = $1; }
  exp2: NUM { $<ival>$ = $<ival>1; }

In both cases, $$ and $1 are passed to the backend as having type
'ival' resulting in code like `val.ival`.  This is troublesome in the
case of api.value.type=union, since in that case the code is this:

  %define api.value.type union
  %token <int> NUM
  %type <int> exp1 exp2
  exp1: NUM { $$ = $1; }
  exp2: NUM { $<int>$ = $<int>1; }

because in this case, since the backend does not know the symbol being
processed, it is forced to generate casts in both cases: `*(int*)(&val)`.
This is unfortunate in the first case (exp1) where there is no reason
at all to use a cast instead of `val.NUM` and `val.exp1`.

So instead delegate the computation of the actual value type to the
backend: pass $<ival>$ as `symbol-number, ival` and $$ as
`symbol-number, NULL`, instead of passing `ival` as before.
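
The difference between the two kinds of access can be sketched in C as
follows.  This is only an illustration, not the code Bison emits: the
`value` union and the `reduce_exp1`/`reduce_exp2` helpers are
hypothetical names, assuming a union with one member per typed symbol
as described above.

```c
#include <assert.h>

/* Hypothetical value type, in the spirit of what
   `%define api.value.type union` gives for
   `%token <int> NUM` and `%type <int> exp1 exp2`:
   one member per typed symbol.  */
typedef union value
{
  int NUM;
  int exp1;
  int exp2;
} value;

/* exp1: NUM { $$ = $1; }
   The symbols are known, so their own union members can be read
   and written directly.  */
void
reduce_exp1 (value *lhs, const value *rhs1)
{
  assert (lhs && rhs1);
  lhs->exp1 = rhs1->NUM;
}

/* exp2: NUM { $<int>$ = $<int>1; }
   Only the explicit type tag is used, hence the casts.  */
void
reduce_exp2 (value *lhs, const value *rhs1)
{
  assert (lhs && rhs1);
  *(int *) lhs = *(const int *) rhs1;
}
```

Both helpers store the same integer; the first form simply avoids the
cast when the symbol is known.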

* src/scan-code.l (handle_action_dollar): Find the symbol the action
is about, not just its type.  Pass both symbol-number, and explicit
type tag ($<tag>n when there is one) to b4_lhs_value and b4_rhs_value.

* data/bison.m4 (b4_symbol_action): adjust to the new signature to
b4_dollar_pushdef.

* data/c-like.m4 (_b4_dollar_dollar, b4_dollar_pushdef): Accept the
symbol-number as new argument.

* data/c.m4 (b4_symbol_value): Accept the symbol-number as new
argument, and use it.
(b4_symbol_value_union): Accept the symbol-number as new
argument, and use it to prefer reading a union member rather than
casting the union.
* data/yacc.c (b4_lhs_value, b4_rhs_value): Accept the new
symbol-number argument.
Adjust uses of b4_dollar_pushdef.
* data/glr.c (b4_lhs_value, b4_rhs_value): Adjust.

* data/lalr1.cc (b4_symbol_value_template, b4_lhs_value): Adjust
to the new symbol-number argument.
* data/variant.hh (b4_symbol_value, b4_symbol_value_template): Accept
the new symbol-number argument.

* data/java.m4 (b4_symbol_value, b4_rhs_data): New.
(b4_rhs_value): Use them.
* data/lalr1.java: Adjust to b4_dollar_pushdef, and use b4_rhs_data.
2018-12-03 18:40:26 +01:00
Akim Demaille
e40db8976c style: comment and formatting changes
* data/bison.m4, data/c++.m4, data/glr.c, data/java.m4, data/lalr1.cc,
* data/yacc.c, src/scan-code.l:
Fix comments.
Prefer POS to denote the position of a symbol in a rule, since NUM
is also used to denote symbol numbers.
2018-12-03 08:42:26 +01:00
Akim Demaille
3422ee7435 style: unsigned int -> unsigned
See
https://lists.gnu.org/archive/html/bison-patches/2018-08/msg00027.html

* src/output.c (muscle_insert_unsigned_int_table): Rename as...
(muscle_insert_unsigned_table): this.
2018-12-01 11:13:08 +01:00
Akim Demaille
e1094c4c09 output: restore yyrhs and yyprhs
This was demanded several times.  See for instance:

- David M. Warme
  https://lists.gnu.org/archive/html/help-bison/2011-04/msg00003.html

- box12009
  http://lists.gnu.org/archive/html/bug-bison/2016-10/msg00001.html

Basically, this reverts:

- commit 3d3bc1fe30
  Get rid of (yy)rhs and (yy)prhs

- commit d333175f63
  Avoid compiler warning.

Note that since these tables are not needed in the generated parsers,
no skeleton requests them.  This change only brings back their
definition to M4, making it possible for user-defined skeletons to use
these tables.
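
As a reminder of what these tables look like, here is a small C sketch.
It assumes the layout used by older generated parsers (rhs is a
-1-separated list of the rules' right-hand sides, and prhs[R] is the
index in rhs of the first symbol of rule R).  The toy grammar, the
symbol numbers and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol numbers for the toy grammar
     0 $accept: exp $end
     1 exp: NUM
     2 exp: exp '+' exp  */
enum { S_END = 0, S_NUM = 3, S_PLUS = 4, S_EXP = 6 };

/* Right-hand sides of all the rules, each one terminated by -1.  */
static const int rhs[] =
{
  S_EXP, S_END, -1,
  S_NUM, -1,
  S_EXP, S_PLUS, S_EXP, -1
};

/* prhs[R] is the index in rhs of the first RHS symbol of rule R.  */
static const int prhs[] = { 0, 3, 5 };

/* Number of symbols in the right-hand side of RULE.  */
size_t
rule_length (int rule)
{
  size_t res = 0;
  for (const int *sp = rhs + prhs[rule]; *sp != -1; ++sp)
    ++res;
  return res;
}

/* The POS-th (0-based) right-hand side symbol of RULE.  */
int
rule_rhs_symbol (int rule, size_t pos)
{
  assert (pos < rule_length (rule));
  return rhs[(size_t) prhs[rule] + pos];
}
```

A user-defined skeleton could walk the tables the same way, for
instance to print each rule in full.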

* src/output.c (muscle_insert_item_number_table): Define.
(prepare_rules): Generate the rhs and prhs tables.
2018-12-01 11:12:59 +01:00
Akim Demaille
060da943bd regen 2018-11-30 06:10:21 +01:00
Akim Demaille
b7577ea6f6 parser: shorten side-effects on current_type
* src/parse-gram.y (tag.opt): Don't change current_type.
Rather, return its value.
Adjust dependencies.
2018-11-30 06:07:56 +01:00
Akim Demaille
6220e96e76 style: reduce scopes
* src/symlist.c: here.
2018-11-30 06:04:03 +01:00
Akim Demaille
b1d6c42ae5 regen 2018-11-29 06:16:20 +01:00
Akim Demaille
8e092082cb parser: factor the symbol definition
* src/parse-gram.y (int.opt, string_as_id.opt): New.
(symbol_def): Use it.
2018-11-29 06:16:20 +01:00
Akim Demaille
2c5e933672 parser: improve location of string alias errors
* src/parse-gram.y (symbol_def): Pass the right location for symbol_make_alias.
* tests/regression.at (Duplicate string): Move to...
* tests/input.at: here.
(Token collisions): New.
2018-11-29 06:16:20 +01:00
Akim Demaille
d92ed9d9f7 diagnostics: complain about Bison directives when -Wyacc
* src/complain.h, src/complain.c (bison_directive): New.
* src/scan-gram.l (BISON_DIRECTIVE): New.
Use it for Bison extensions.
2018-11-29 06:16:20 +01:00
Akim Demaille
0e9eade009 regen 2018-11-27 08:32:49 +01:00
Akim Demaille
9686b585e7 %nterm: do not accept character literals
Reported by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html

* src/complain.h: Formatting change.
* src/parse-gram.y (id): Reject character literals used in a context
for non-terminals.
* tests/input.at (Invalid %nterm uses): Check that.
2018-11-27 08:25:38 +01:00
Akim Demaille
4bddd33439 %nterm: do not accept numbers nor string alias
Reported by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html

* src/parse-gram.y (symbol_def): Refuse string aliases and numbers
for non-terminals.
(prologue_declaration): Recover from errors ended with ';'.
* tests/input.at (Invalid %nterm uses): New.
2018-11-27 08:25:38 +01:00
Akim Demaille
bcecfbafab gnulib: update to use its bitsets
Bison's bitsets were moved to gnulib.

* lib/abitset.c, lib/abitset.h, lib/bbitset.h, lib/bitset.c,
* lib/bitset.h, lib/ebitset.c, lib/ebitset.h, lib/lbitset.c,
* lib/bitset_stats.c, lib/bitset_stats.h, lib/bitsetv-print.c,
* lib/bitsetv-print.h, lib/bitsetv.c, lib/bitsetv.h,
* lib/lbitset.h, lib/vbitset.c, lib/vbitset.h:
Remove.

* gnulib: Update.
* bootstrap.conf, lib/local.mk: Adjust.
2018-11-26 06:33:45 +01:00
Akim Demaille
9ffed56cd9 regen 2018-11-25 11:27:08 +01:00
Akim Demaille
7ded5bb764 %expect-rr: tune the number of conflicts per rule
Currently on a grammar such as

    exp : a '1' | a '2' | a '3' | b '1' | b '2' | b '3'
    a:
    b:

we count only one rr-conflict on the `b:` rule, i.e., we expect:

    b: %expect-rr 1

although there are 3 conflicts in total.  That's because in the
conflicted state we count only a single conflict, not three (one for
each of the lookaheads: '1', '2', '3').

    State 0

        0 $accept: . exp $end
        1 exp: . a '1'
        2    | . a '2'
        3    | . a '3'
        4    | . b '1'
        5    | . b '2'
        6    | . b '3'
        7 a: . %empty  ['1', '2', '3']
        8 b: . %empty  ['1', '2', '3']

        '1'       reduce using rule 7 (a)
        '1'       [reduce using rule 8 (b)]
        '2'       reduce using rule 7 (a)
        '2'       [reduce using rule 8 (b)]
        '3'       reduce using rule 7 (a)
        '3'       [reduce using rule 8 (b)]
        $default  reduce using rule 7 (a)

        exp  go to state 1
        a    go to state 2
        b    go to state 3

See https://lists.gnu.org/archive/html/bison-patches/2013-02/msg00106.html.
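
The per-lookahead counting can be sketched in C as below.  This is not
the code from src/conflicts.c: the `reduction` struct, the bitmask
encoding of lookahead sets and the function name are assumptions made
for the illustration.

```c
#include <assert.h>

/* Hypothetical model of a state's reductions: a rule number and
   its lookahead set, as a bitmask of token numbers.  */
typedef struct
{
  int rule;
  unsigned lookaheads;
} reduction;

/* Number of lookahead tokens on which RULE's reduction competes
   with some other reduction among the NREDS ones in REDS.  */
int
rule_rr_conflicts_sketch (const reduction *reds, int nreds, int rule)
{
  assert (0 <= nreds);
  /* Lookaheads of RULE, and those of all the other rules.  */
  unsigned mine = 0;
  unsigned others = 0;
  for (int i = 0; i < nreds; ++i)
    if (reds[i].rule == rule)
      mine |= reds[i].lookaheads;
    else
      others |= reds[i].lookaheads;

  /* One conflict per token present in both sets.  */
  int res = 0;
  for (unsigned common = mine & others; common; common &= common - 1)
    ++res;
  return res;
}
```

With State 0 above, where rules 7 and 8 both have the lookaheads '1',
'2' and '3', this yields 3 for rule 8 rather than a single yes/no
answer.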

* src/conflicts.c (rule_has_state_rr_conflicts): Rename as...
(count_rule_state_rr_conflicts): this.
DWIM.
(count_rule_rr_conflicts): Adjust.
* tests/conflicts.at (%expect-rr in grammar rules)
(%expect-rr too much in grammar rules)
(%expect-rr not enough in grammar rules): New.
2018-11-22 08:34:10 +01:00
Akim Demaille
ad0b4661d1 %expect-rr: fix the computation of the overall number of conflicts
On a grammar such as

   exp: "num" | "num" | "num"

we currently report only one RR conflict, instead of two.

This bug is present since the origins of Bison

    commit 08089d5d35
    Author: David MacKenzie <djm@djmnet.org>
    Date:   Tue Apr 20 05:42:52 1993 +0000

       Initial revision

and was preserved in

    commit 676385e29c
    Author: Paul Hilfinger <Hilfinger@CS.Berkeley.EDU>
    Date:   Fri Jun 28 02:26:44 2002 +0000

       Initial check-in introducing experimental GLR parsing.  See entry in
       ChangeLog dated 2002-06-27 from Paul Hilfinger for details.

See
https://lists.gnu.org/archive/html/bison-patches/2018-11/msg00011.html
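
The intended count can be sketched in C as follows, assuming that N
reductions competing on the same lookahead amount to N - 1 conflicts.
The function name and the bitmask encoding of the lookahead sets are
hypothetical, not those of src/conflicts.c.

```c
#include <assert.h>
#include <limits.h>

/* LOOKAHEADS[I] is the lookahead set of the I-th reduction of a
   state, as a bitmask over NTOKENS token numbers.  Return the
   number of reduce/reduce conflicts in that state.  */
int
state_rr_conflicts_sketch (const unsigned *lookaheads, int nreds,
                           int ntokens)
{
  assert (0 <= ntokens
          && ntokens <= (int) (sizeof (unsigned) * CHAR_BIT));
  int res = 0;
  for (int tok = 0; tok < ntokens; ++tok)
    {
      /* Number of reductions enabled on TOK.  */
      int n = 0;
      for (int i = 0; i < nreds; ++i)
        if ((lookaheads[i] >> tok) & 1u)
          ++n;
      /* N competing reductions make N - 1 conflicts.  */
      if (2 <= n)
        res += n - 1;
    }
  return res;
}
```

For the grammar above, the three reductions of "num" to exp share a
single lookahead, which gives two conflicts.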

* src/conflicts.h, src/conflicts.c (count_state_rr_conflicts)
(count_rr_conflicts): Use only the correct count of conflicts.
* tests/glr-regression.at: Fix expectations.
2018-11-22 08:34:07 +01:00
Akim Demaille
e51fd547ca %expect: tune the number of conflicts per rule
Currently on a grammar such as

    exp: "number" | exp "+" exp | exp "*" exp

we count only one sr-conflict for both binary rules, i.e., we expect:

    exp: "number" | exp "+" exp  %expect 1 | exp "*" exp  %expect 1

although there are 4 conflicts in total.  That's because in the states
in conflict, for instance that for the "+" rule:

    State 6

        2 exp: exp . "+" exp
        2    | exp "+" exp .  [$end, "+", "*"]
        3    | exp . "*" exp

        "+"  shift, and go to state 4
        "*"  shift, and go to state 5

        "+"       [reduce using rule 2 (exp)]
        "*"       [reduce using rule 2 (exp)]
        $default  reduce using rule 2 (exp)

we count only a single conflict, although there are two (one on "+"
and another with "*").

See https://lists.gnu.org/archive/html/bison-patches/2013-02/msg00106.html.
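
A minimal C sketch of this count, assuming a state is summarized by
the set of tokens it can shift and by the lookahead set of the rule
being reduced, both as bitmasks.  The `state_sketch` struct and the
function name are hypothetical, and conflicts resolved by precedence
are not modelled.

```c
#include <assert.h>

/* Hypothetical summary of a state, for one of its reductions.  */
typedef struct
{
  unsigned shift_tokens;       /* Tokens the state can shift.  */
  unsigned reduce_lookaheads;  /* Lookaheads of the rule to reduce.  */
} state_sketch;

/* Number of tokens on which the state can both shift and reduce
   the rule: one shift/reduce conflict per such token.  */
int
rule_sr_conflicts_sketch (const state_sketch *s)
{
  assert (s);
  int res = 0;
  for (unsigned both = s->shift_tokens & s->reduce_lookaheads;
       both; both &= both - 1)
    ++res;
  return res;
}
```

In State 6 above, the shifts are on "+" and "*" and rule 2 has the
lookaheads $end, "+" and "*", so the count is two for that state.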

* src/conflicts.c (rule_has_state_sr_conflicts): Rename as...
(count_rule_state_sr_conflicts): this.
DWIM.
(count_rule_sr_conflicts): Adjust.
* tests/conflicts.at (%expect in grammar rules): New.
2018-11-21 22:10:35 +01:00
Akim Demaille
4ebebcc438 regen 2018-11-21 22:10:35 +01:00
Akim Demaille
2b2556b41c style: reduce scopes
* src/conflicts.c, src/reader.c: Minor style changes.
2018-11-21 22:08:47 +01:00
Paul Hilfinger
b34b12c4f9 allow %expect and %expect-rr modifiers on individual rules
This change allows one to document (and check) which rules participate
in shift/reduce and reduce/reduce conflicts.  This is particularly
important for GLR parsers, where conflicts are a normal occurrence.  For
example,

    %glr-parser
    %expect 1
    %%

    ...

    argument_list:
      arguments %expect 1
    | arguments ','
    | %empty
    ;

    arguments:
      expression
    | argument_list ',' expression
    ;

    ...

Looking at the output from -v, one can see that the shift-reduce
conflict here is due to the fact that the parser does not know whether
to reduce arguments to argument_list until it sees the token AFTER the
following ','.  By marking the rule with %expect 1 (because there is a
conflict in one state), we document the source of the 1 overall shift-
reduce conflict.

In GLR parsers, we can use %expect-rr in a rule for reduce/reduce
conflicts.  In this case, we mark each of the conflicting rules.  For
example,

    %glr-parser
    %expect-rr 1

    %%

    stmt:
      target_list '=' expr ';'
    | expr_list ';'
    ;

    target_list:
      target
    | target ',' target_list
    ;

    target:
      ID %expect-rr 1
    ;

    expr_list:
      expr
    | expr ',' expr_list
    ;

    expr:
      ID %expect-rr 1
    | ...
    ;

In a statement such as

    x, y = 3, 4;

the parser must reduce x to a target or an expr, but does not know
which until it sees the '='.  So we notate the two possible reductions
to indicate that each conflicts in one rule.

See https://lists.gnu.org/archive/html/bison-patches/2013-02/msg00105.html.

* doc/bison.texi (Suppressing Conflict Warnings): Document %expect,
%expect-rr in grammar rules.
* src/conflicts.c (count_state_rr_conflicts): Adjust comment.
(rule_has_state_sr_conflicts): New static function.
(count_rule_sr_conflicts): New static function.
(rule_has_state_rr_conflicts): New static function.
(count_rule_rr_conflicts): New static function.
(rule_conflicts_print): New static function.
(conflicts_print): Also use rule_conflicts_print to report on individual
rules.
* src/gram.h (struct rule): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* src/reader.c (grammar_midrule_action): Transfer expected_sr_conflicts,
expected_rr_conflicts to new rule, and turn off in current_rule.
(grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
(packgram): Transfer expected_sr_conflicts, expected_rr_conflicts
to new rule.
* src/reader.h (grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
* src/symlist.c (symbol_list_sym_new): Initialize expected_sr_conflicts,
expected_rr_conflicts.
* src/symlist.h (struct symbol_list): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* tests/conflicts.at: Add tests "%expect in grammar rule not enough",
"%expect in grammar rule right.", "%expect in grammar rule too much."
2018-11-21 22:08:47 +01:00
Akim Demaille
ebb92c0545 regen 2018-11-20 20:04:06 +01:00
Akim Demaille
e0de1020ea style: avoid lengthy actions
We also lack a consistent naming for directive implementations.
`directive_skeleton` is too long, `percent_skeleton` is not very nice
looking, `process_skeleton` looks ambiguous, `do_skeleton` is somewhat
ambiguous too, but seems a better track.

* src/parse-gram.y (version_check): Rename as...
(do_require): this.
(do_skeleton): New.
Use it.
2018-11-20 20:03:01 +01:00
Akim Demaille
a52723e3e8 style: formatting changes
* src/scan-gram.l: here.
2018-11-13 07:46:08 +01:00
Akim Demaille
4810ed8107 regen 2018-11-12 07:41:46 +01:00
Akim Demaille
35b8e0e947 parser: deprecate %error-verbose
It is unfortunate that %error_verbose was properly diagnosed as
obsoleted by "%define parse.error verbose", but %error-verbose was
not.

* src/parse-gram.y (%error-verbose): Remove support.
* src/scan-gram.l: Do it here instead, with a warning.
* tests/input.at (Deprecated directives): Check it.
2018-11-12 07:41:46 +01:00
Akim Demaille
7928c3e6fb parser: deprecate %nterm
It has several weaknesses.
Reported by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html

* src/scan-gram.l: here.
2018-11-12 07:28:20 +01:00
Akim Demaille
3d601616da regen 2018-11-10 17:03:36 +01:00
Akim Demaille
bda2bed459 reader: no longer accept %define variable names in quotes
It was never documented.

* src/parse-gram.y (variable): Here.
2018-11-10 17:02:50 +01:00
Akim Demaille
3ae81aa338 dogfooding: use api.value.type union
* src/parse-gram.y (api.value.type): Set to union.
Replace occurrences of %union with explicit %types.
* src/scan-gram.l: Adjust yylval's field names.
(RETURN_VALUE): No longer needs the Field argument.
Use it more.
2018-11-10 17:02:50 +01:00
Akim Demaille
eee37354b5 scanner: simplify use of gettext
* src/scan-gram.l (unexpected_end): Leave the actual call to gettext
to the caller.
2018-11-10 17:02:50 +01:00
Akim Demaille
be737c3dd6 style: clean up the scanner and parser
* src/scan-gram.l: Formatting changes.
Add "missing" assertion for symmetry.
* src/parse-gram.y: Formatting changes.
2018-11-10 17:02:50 +01:00
Akim Demaille
e605ad9679 build: fix use of gnulib Make variables
Reported by Kiyoshi Kanazawa.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00048.html

* lib/local.mk (lib_libbison_a_LIBADD): Merge into...
* src/local.mk (src_bison_LDADD): here.
2018-10-30 07:01:21 +01:00
Akim Demaille
96f503e197 style: clean up src/AnnotationList.c
* src/AnnotationList.c: Reduce scopes.
2018-10-28 17:56:22 +01:00
Akim Demaille
9912dd28ca style: clean up print.c
* src/print.c: Reduce scopes.
2018-10-28 16:32:12 +01:00
Akim Demaille
7c4b40de61 build: remove a few copies of the Copyright from the generated Makefile
* build-aux/local.mk, cfg.mk, examples/calc++/local.mk,
* examples/local.mk, examples/mfcalc/local.mk,
* examples/rpcalc/local.mk, lib/local.mk, src/local.mk,
* tests/local.mk:
Use Automake comments so that we don't get a copy of each in the
generated Makefile.
2018-10-24 06:18:57 +02:00
Akim Demaille
0308dfb039 regen 2018-10-23 09:08:57 +02:00