422 Commits

Author SHA1 Message Date
Akim Demaille
8eaddf326b multistart: turn start symbols into rules on $accept
Now that the parser can read several start symbols, let's process
them, and create the corresponding rules.
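
As a rough illustration of this step (a Python sketch, not Bison's C code): each start symbol gets its own rule on $accept, and when there are several, a switching token in front of the symbol tells the parser which entry point was requested. The function name, the `YY_PARSE_` token naming, and the choice to omit the switching token for a single start symbol are assumptions made for this example.

```python
def create_start_rules(start_symbols):
    """Return one ("$accept", rhs) rule per start symbol (hypothetical model)."""
    if not start_symbols:
        raise ValueError("at least one start symbol is required")
    if len(start_symbols) == 1:
        # Assumption: a single start symbol needs no switching token.
        return [("$accept", [start_symbols[0], "$end"])]
    rules = []
    for sym in start_symbols:
        # Hypothetical naming scheme for the switching token.
        switch = "YY_PARSE_" + sym
        rules.append(("$accept", [switch, sym, "$end"]))
    return rules


for lhs, rhs in create_start_rules(["exp", "stmt"]):
    print(lhs + ": " + " ".join(rhs))
```

With two start symbols, the sketch yields two rules on $accept, each starting with its own switching token.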

* src/parse-gram.y (grammar_declaration): Accept a list of start symbols.
* src/reader.h, src/reader.c (grammar_start_symbol_set): Rename as...
(grammar_start_symbols_set): this.

* src/reader.h, src/reader.c (start_flag): Replace with...
(start_symbols): this.
* src/reader.c (grammar_start_symbols_set): Build a list of start
symbols.
(switching_token, create_start_rules): New.
(check_and_convert_grammar): Use them to turn the list of start
symbols into a set of rules.
* src/reduce.c (nonterminals_reduce): Don't complain about $accept,
it's an internal detail.
(reduce_grammar): Complain about all the start symbols that don't
derive sentences.

* src/symtab.c (startsymbol, startsymbol_loc): Remove, replaced by
start_symbols.
(symbols_pack): Move the check about the start symbols
to...
* src/symlist.c (check_start_symbols): here.
Adjust to multiple start symbols.
* tests/reduce.at (Empty Language): Generalize into...
(Bad start symbols): this.
2020-09-27 09:23:51 +02:00
Akim Demaille
e50ec28153 reader: get ready to create several initial rules
* src/reader.c (create_start_rule): New.
Use it.
2020-09-27 09:23:50 +02:00
Akim Demaille
0711dca9d9 add support for --html
* bootstrap.conf: We need the "execute" module.
* src/files.h, src/files.c (spec_html_file, html_flag): New.
* src/getargs.h, src/getargs.c (--html): New.
* src/print-xml.h, src/print-xml.c (print_html): New.
* src/main.c: Use them.
* tests/output.at, tests/report.at: Check --html.
2020-09-19 17:49:03 +02:00
Valentin Tolmer
ef09bf065a glr2.cc: fork glr.cc to a c++ version
This is a fork of glr.cc to be c++-first instead of a wrapper around
glr.c.

* data/skeletons/glr2.cc: New.
* data/skeletons/bison.m4, data/skeletons/c++.m4: Adjust.
* data/skeletons/c.m4 (b4_user_args_no_comma): New.
* src/reader.c (grammar_rule_check_and_complete): glr2.cc is C++.
* tests/actions.at, tests/c++.at, tests/calc.at, tests/conflicts.at,
* tests/input.at, tests/local.at, tests/regression.at, tests/scanner.at,
* tests/synclines.at, tests/types.at: Also check glr2.cc.
2020-08-30 10:45:21 +02:00
Akim Demaille
b7aab2dbad fix: crash when redefining the EOF token
Reported by Agency for Defense Development.
https://lists.gnu.org/r/bug-bison/2020-08/msg00008.html

On an empty grammar such as

    %token FOO
           BAR
           FOO 0
    %%
    input: %empty

we crash because when we find FOO 0, we decrement ntokens (since FOO
was discovered to be EOF, which is already known to be a token, so we
increment ntokens for it, and need to cancel this).  This "works well"
when EOF is properly defined in one go, but here it is first defined
and later only assigned token code 0.  In the meantime BAR was given
the token number that we just decremented.

To fix this, assign symbol numbers after parsing, not during parsing,
so that we also saw all the explicit token codes.  To maintain the
current numbers (I'd like to keep no difference in the output, not
just equivalence), we need to make sure the symbols are numbered in
the same order: that of appearance in the source file.  So we need the
locations to be correct, which was almost the case, except for nterms
that appeared several times as LHS (i.e., several times as "foo:
...").  Fixing the use of location_of_lhs sufficed (it appears it was
intended for this use, but its implementation was unfinished: it was
always set to "false" only).
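
The idea of numbering after parsing can be sketched as follows (Python, for illustration only; the data model and the function name are hypothetical, and the built-in error and undefined tokens are left out).  Declarations are first collected, explicit token codes are recorded, and only then are numbers handed out in order of first appearance.

```python
def number_symbols(decls):
    """Number symbols once parsing is over.

    decls: list of (name, kind, location, code), where kind is "token" or
    "nterm", location is a sortable source position, and code is an explicit
    token code or None.  (Hypothetical data model.)
    """
    first_seen = {}  # name -> (kind, earliest location)
    codes = {}       # name -> explicit token code
    for name, kind, loc, code in decls:
        if name not in first_seen or loc < first_seen[name][1]:
            first_seen[name] = (kind, loc)
        if code is not None:
            codes[name] = code

    # A token with explicit code 0 is the EOF token; otherwise fall back to $end.
    eof = next((n for n, c in codes.items() if c == 0), "$end")

    by_location = sorted(first_seen, key=lambda n: first_seen[n][1])
    tokens = [n for n in by_location if first_seen[n][0] == "token" and n != eof]
    nterms = [n for n in by_location if first_seen[n][0] == "nterm"]

    numbers = {eof: 0}
    for name in tokens + nterms:
        numbers[name] = len(numbers)
    ntokens = 1 + len(tokens)  # EOF plus the other tokens
    nnterms = len(nterms)
    return numbers, ntokens, nnterms


# The grammar from this message: FOO, BAR, then "FOO 0", then "input".
decls = [
    ("FOO", "token", 1, None),
    ("BAR", "token", 2, None),
    ("FOO", "token", 3, 0),
    ("input", "nterm", 4, None),
]
numbers, ntokens, nnterms = number_symbols(decls)
print(numbers, ntokens + nnterms)
```

Because nothing is numbered until every explicit code has been seen, no counter ever has to be decremented, and BAR cannot end up sharing a number with another symbol.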

* src/symtab.c (symbol_location_as_lhs_set): Update location_of_lhs.
(symbol_code_set): Remove broken hack that decremented ntokens.
(symbol_class_set, dummy_symbol_get): Don't set number, ntokens and
nnterms.
(symbol_check_defined): Do it.
(symbols): Don't count nsyms here.
Actually, don't count nsyms at all: let it be done in...
* src/reader.c (check_and_convert_grammar): here.  Define nsyms from
ntokens and nnterms after parsing.
* tests/input.at (EOF redeclared): New.

* examples/c/bistromathic/bistromathic.test: Adjust the traces: in
"%nterm <double> exp %% input: ...", exp used to be numbered before
input.
2020-08-07 07:30:06 +02:00
Akim Demaille
89e42ffb4b style: fix missing space before paren
* cfg.mk (_space_before_paren_exempt): Be less lax.
* src/output.c, src/reader.c: Fix space before paren issues.
Pacify the warnings where applicable.
2020-08-07 07:30:06 +02:00
Maarten De Braekeleer
ad6f600bb1 portability: rename accept to acceptsymbol because of MSVC
MSVC already defines this symbol.

* src/symtab.h, src/symtab.c (accept): Rename as...
(acceptsymbol): this.
Adjust dependencies.
2020-08-02 08:32:57 +02:00
Akim Demaille
0820f16ca8 style: update comments
* src/reader.c: action_obstack was removed in 2002...
* src/parse-gram.y: Better names.
* src/scan-code.h: More comments.
2020-07-05 09:59:45 +02:00
Akim Demaille
0e5cbd38b2 style: shift/reduce, not shift-reduce
* src/reader.c: here.
2020-06-28 08:33:24 +02:00
Akim Demaille
feb0bb0a59 style: rename endtoken as eoftoken
* src/symtab.h, src/symtab.c (endtoken): Rename as...
(eoftoken): this.
Adjust dependencies.
2020-06-27 17:31:59 +02:00
Akim Demaille
0895858d8e style: use 'nonterminal' consistently
* doc/bison.texi: Formatting changes.
* src/gram.h, src/gram.c (nvars): Rename as...
(nnterms): this.
Adjust dependencies.
(section): New.  Use it.
Replace "non terminal" and "non-terminal" by "nonterminal".
2020-06-27 11:39:32 +02:00
Akim Demaille
5855da4722 parser: keep string aliases as the user wrote it
Currently our scanner decodes all the escapes in the strings, and we
later reescape the strings when we emit them.

This is troublesome, as we do not respect the user input.  For
instance, when the user writes in UTF-8, we destroy her string when we
write it back.  And this shows everywhere: in the reports we show the
escaped string instead of the actual alias:

    0 $accept: . exp $end
    1 exp: . exp "\342\212\225" exp
    2    | . exp "+" exp
    3    | . exp "+" exp
    4    | . "number"
    5    | . "\303\221\303\271\341\271\203\303\251\342\204\235\303\264"

    "number"                                                    shift, and go to state 1
    "\303\221\303\271\341\271\203\303\251\342\204\235\303\264"  shift, and go to state 2

This commit preserves the user's exact spelling of the string aliases,
instead of interpreting the escapes and then reescaping.  The report
now shows:

    0 $accept: . exp $end
    1 exp: . exp "⊕" exp
    2    | . exp "+" exp
    3    | . exp "+" exp
    4    | . "number"
    5    | . "Ñùṃéℝô"

    "number"          shift, and go to state 1
    "Ñùṃéℝô"  shift, and go to state 2

Likewise, the XML (and therefore HTML) outputs are fixed.
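
To see why decoding and then re-escaping loses the user's spelling, here is a small Python sketch (not the scanner's actual code; both function names are made up for the example).  Re-escaping works on bytes, so every non-ASCII byte of a UTF-8 string becomes an octal escape, while preserving the spelling simply carries the text through.

```python
def reescape(text):
    """Mimic the old behavior: emit non-printable or non-ASCII bytes as octal escapes."""
    out = []
    for byte in text.encode("utf-8"):
        if 0x20 <= byte < 0x7F and byte not in (0x22, 0x5C):
            out.append(chr(byte))       # printable ASCII other than '"' and '\'
        else:
            out.append("\\%03o" % byte)  # e.g. 0xE2 -> \342
    return '"' + "".join(out) + '"'


def preserve(text):
    """Mimic the new behavior: keep the alias exactly as written."""
    return '"' + text + '"'


print(reescape("⊕"))
print(preserve("⊕"))
```

The first call prints the three octal escapes seen in the old report for rule 1; the second prints the alias as the user typed it.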

* src/scan-gram.l (STRING, TSTRING): Do not interpret the escapes in
the resulting string.
* src/parse-gram.y (unquote, parser_init, parser_free, unquote_free)
(handle_defines, handle_language, obstack_for_unquote): New.
Use them to unquote where needed.
* tests/regression.at, tests/report.at: Update.
2020-06-13 16:56:40 +02:00
Akim Demaille
e7aff57122 style: rename user_token_number as code
This should have been done in 3.6, but I wanted to avoid introducing
conflicts into Vincent's work on counterexamples.  It turns out it's
completely orthogonal.

* data/README.md, data/skeletons/bison.m4, data/skeletons/c++.m4,
* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/java.m4,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/variant.hh, data/skeletons/yacc.c, src/conflicts.c,
* src/derives.c, src/gram.c, src/gram.h, src/output.c,
* src/parse-gram.c, src/parse-gram.y, src/print-xml.c, src/print.c,
* src/reader.c, src/symtab.c, src/symtab.h, tests/input.at,
* tests/types.at:
s/user_token_number/code/g.
Plus minor changes.
2020-05-23 08:43:58 +02:00
Akim Demaille
e50de09886 tokens: properly define the YYEOF token kind
Currently EOF is handled in an ad hoc way, with a #define YYEOF 0 in
the implementation file.  As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.

Give the $end token a visible kind name, YYEOF.  Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.

* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
2020-04-12 13:56:44 +02:00
Akim Demaille
cc68bbf799 bison: use consistently "token kind", not "token type"
* src/output.c, src/reader.c, src/scan-gram.l, src/tables.c: here.
2020-04-05 19:14:39 +02:00
Akim Demaille
296660304c style: comment changes
* src/symtab.h, src/lr0.c: here.
2020-02-23 08:25:53 +01:00
Victor Morales Cayuela
e09a72eeb0 diagnostics: modernize the display of submessages
Since Bison 2.7, output was indented four spaces for explanatory
statements.  For example:

    input.y:2.7-13: error: %type redeclaration for exp
    input.y:1.7-11:     previous declaration

Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:

    input.y:2.7-13: error: %type redeclaration for exp
        2 | %type <float> exp
          |       ^~~~~~~
    input.y:1.7-11: note: previous declaration
        1 | %type <int> exp
          |       ^~~~~

* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
2020-02-15 08:28:40 +01:00
Akim Demaille
8036635251 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-05 10:26:35 +01:00
Akim Demaille
28d1ca8f48 diagnostics: yacc reserves %type to nonterminals
On

    %token TOKEN1
    %type  <ival> TOKEN1 TOKEN2 't'
    %token TOKEN2
    %%
    expr:

bison -Wyacc gives

    input.y:2.15-20: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |               ^~~~~~
    input.y:2.29-31: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |                             ^~~
    input.y:2.22-27: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |                      ^~~~~~

The messages appear to be out of order, but they are emitted when the
error is found.

* src/symtab.h (symbol_class): Add pct_type_sym, used to denote
symbols appearing in %type.
* src/symtab.c (complain_pct_type_on_token): New.
(symbol_class_set): Check that %type is not applied to tokens.
(symbol_check_defined): pct_type_sym also means undefined.
* src/parse-gram.y (symbol_decl.1): Set the class to pct_type_sym.
* src/reader.c (grammar_current_rule_begin): pct_type_sym also means
undefined.
* tests/input.at (Yacc's %type): New.
2019-11-17 09:45:25 +01:00
Akim Demaille
8228d96d33 reader: reduce the "scope" of global variables
We have too many global variables, adding structure would help.  For a
start, let's hide some of the variables closer to their usage.

* src/getargs.c, src/files.h (current_file): Move to...
* src/scan-gram.c: here.
* src/scan-gram.h (gram_in, gram__flex_debug): Remove, make them
private to the scanner.
* src/reader.h, src/reader.c (reader): Take a grammar file as argument.
Move the handling of scanner variables to...
* src/scan-gram.l (gram_scanner_open, gram_scanner_close): here.
(gram_scanner_initialize): Remove, replaced by gram_scanner_open.
* src/main.c: Adjust.
2019-10-26 10:39:01 +02:00
Akim Demaille
6e7d8ba6a7 reader: let symtab deal with the symbols
* src/reader.c (reader): Move the setting up of the builtin symbols to...
* src/symtab.c (symbols_new): here.
2019-10-25 07:48:07 +02:00
Yuichiro Kaneko
3945beb1d2 style: update comment in reader.c
rrhs and rlhs were removed by b2ed6e5826.

* src/reader.c (packgram): Update comment.
2019-10-23 08:32:06 +02:00
Akim Demaille
9e6c5328d3 diagnostics: also show suggested %empty
* src/reader.c (grammar_rule_check_and_complete): Suggest to add %empty.
* tests/actions.at, tests/diagnostics.at: Adjust expectations.
2019-10-06 12:15:12 +02:00
Paul Eggert
133edcd248 Prefer signed to unsigned integers
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
* NEWS: Document the API change.
* bootstrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about.  Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
2019-10-02 17:11:33 -07:00
Akim Demaille
8c06cb9130 fixits: be sure to preserve the action when adding %empty
Currently we remove the rhs to install %empty instead.

* src/reader.c (grammar_rule_check_and_complete): Insert the missing
%empty in front of the rhs, not in replacement thereof.
* tests/actions.at (Add missing %empty): Check that.
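
The difference between the two fix-its can be sketched in Python (an illustration only; the helper and the sample rule are hypothetical).  A fix-it is modeled as "replace the range [start, end) with some text": replacing the whole rhs drops the action, whereas inserting at an empty range in front of the rhs keeps it.

```python
def apply_fixit(line, start, end, text):
    """Replace line[start:end] with text; an empty range means pure insertion."""
    return line[:start] + text + line[end:]


rule = "a: { $$ = 42; };"
rhs_start = rule.index("{")
rhs_end = rule.index("}") + 1

# Old behavior: the rhs (here, the action) is replaced by %empty.
replaced = apply_fixit(rule, rhs_start, rhs_end, "%empty")

# New behavior: %empty is inserted in front of the rhs.
inserted = apply_fixit(rule, rhs_start, rhs_start, "%empty ")

print(replaced)
print(inserted)
```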
2019-05-03 16:28:28 +02:00
Akim Demaille
013720f0e7 style: use consistently *_loc for locations
Some members are called foo_location, others are foo_loc.  Stick to
the latter.

* src/gram.h, src/location.h, src/location.c, src/output.c,
* src/parse-gram.y, src/reader.h, src/reader.c, src/reduce.c,
* src/scan-gram.l, src/symlist.h, src/symlist.c, src/symtab.h,
* src/symtab.c:
Use _loc consistently, not _location.
2019-05-03 16:28:28 +02:00
Akim Demaille
365b4d95a4 style: clarify the use of symbol_lists' locations
symbol_list features a 'location' and a 'sym_loc' member.  The former
is expected to be set only for symbol_lists that denote a symbol (not
a type name), and the latter should only denote the location of the
symbol/type name.  Yet both are set, and the name "location" is too
imprecise.

* src/symlist.h, src/symlist.c (symbol_list::location): Rename as
rhs_loc for clarity.  Move it to the "section" of data valid only
for rules.
* src/reader.c, src/scan-code.l: Adjust.
2019-05-03 16:28:28 +02:00
Akim Demaille
57290d63fd package: various fixes for syntax-check
* cfg.mk: Disable checks where needed (e.g., we do want to check the
behavior with tabs).
(sc_at_parser_check): Remove.  Unfortunately since
a11c144609 we no longer use the './'
prefix to run programs in the current directory.  That was so that we
could run Java programs like the others, although they are not run with
the `./` prefix (see 967a59d2c0).
As a consequence this sc check no longer makes sense.
However, since now AT_PARSER_CHECK passes the `./` prefix itself, this
sc-check was superfluous.
* examples/c/reccalc/scan.l: Use memcpy, not strncpy.
* src/ielr.c, src/reader.c: Obfuscate "lr(0)" so that the sc-check for
"space before paren" does not fire.
* tests/diagnostics.at: Avoid space-tab, use tab-tab.
2019-04-28 08:24:31 +02:00
Akim Demaille
971e72514f updates: insert/remove %empty
* src/reader.c (grammar_rule_check_and_complete): Generate fixits for
adding/removing %empty.
* tests/actions.at, tests/diagnostics.at, tests/existing.at: Adjust.
2019-04-24 13:21:24 +02:00
Akim Demaille
ae91c3cce3 reader: clarify variable names
* src/reader.c (grammar_rule_check_and_complete): When 'p' and 'lhs'
are aliases, prefer the latter, for clarity and consistency.
(grammar_current_rule_begin): Avoid 'p', current_rule suffices.
* src/gram.h, src/gram.c: Comment changes.
2019-03-24 18:40:46 +01:00
Akim Demaille
e346210c03 add LR(0) output
This should not be used to generate parsers.  My point is actually to
facilitate debugging (when tweaking the generation of the LR(0)
automaton for instance, not yet caring about lookaheads).

* src/reader.c (prepare_percent_define_front_end_variables): Add lr(0).
* src/conflicts.c (set_conflicts): Be robust to reds not having
lookaheads at all.
* src/ielr.c (LrType, lr_type_get): Adjust.
(ielr): Implement support for LR(0).
* src/lalr.c (lalr_free): Don't free LA when it's not computed.
2019-02-05 19:02:09 +01:00
Akim Demaille
9566232422 style: comment and name changes
* src/output.c (prepare_symbol_names): here.
* src/reader.c: Remove obsolete comment.
* src/scan-code.l: Use || for Boolean or.
2019-02-02 17:32:10 +01:00
Akim Demaille
dc654a925c style: comment changes
* src/reader.c, src/scan-code.l: here.
2019-02-02 17:32:04 +01:00
Akim Demaille
2c8fb4d126 style: rename duplicate_directive as duplicate_rule_directive
* src/complain.h, src/complain.c: here.
Adjust callers.
2019-01-16 07:59:25 +01:00
Akim Demaille
2471733f1a package: bump copyrights to 2019
2019-01-05 14:58:05 +01:00
Akim Demaille
fdceb6330f symbols: set tag_seen when assigning a type to symbols
* src/reader.h, src/reader.c (tag_seen): Move to...
* src/symtab.h, src/symtab.c: here.
(symbol_type_set): Set it to true.
* src/parse-gram.y: Don't.
2018-12-15 17:41:25 +01:00
Akim Demaille
2b2556b41c style: reduce scopes
* src/conflicts.c, src/reader.c: Minor style changes.
2018-11-21 22:08:47 +01:00
Paul Hilfinger
b34b12c4f9 allow %expect and %expect-rr modifiers on individual rules
This change allows one to document (and check) which rules participate
in shift/reduce and reduce/reduce conflicts.  This is particularly
important for GLR parsers, where conflicts are a normal occurrence.  For
example,

    %glr-parser
    %expect 1
    %%

    ...

    argument_list:
      arguments %expect 1
    | arguments ','
    | %empty
    ;

    arguments:
      expression
    | argument_list ',' expression
    ;

    ...

Looking at the output from -v, one can see that the shift/reduce
conflict here is due to the fact that the parser does not know whether
to reduce arguments to argument_list until it sees the token AFTER the
following ','.  By marking the rule with %expect 1 (because there is a
conflict in one state), we document the source of the 1 overall
shift/reduce conflict.

In GLR parsers, we can use %expect-rr in a rule for reduce/reduce
conflicts.  In this case, we mark each of the conflicting rules.  For
example,

    %glr-parser
    %expect-rr 1

    %%

    stmt:
      target_list '=' expr ';'
    | expr_list ';'
    ;

    target_list:
      target
    | target ',' target_list
    ;

    target:
      ID %expect-rr 1
    ;

    expr_list:
      expr
    | expr ',' expr_list
    ;

    expr:
      ID %expect-rr 1
    | ...
    ;

In a statement such as

    x, y = 3, 4;

the parser must reduce x to a target or an expr, but does not know
which until it sees the '='.  So we notate the two possible reductions
to indicate that each conflicts in one rule.
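
The per-rule check can be pictured with a short Python sketch (hypothetical data model and message wording, not the code in src/conflicts.c): a rule that carries an expectation is compared against the number of conflicts actually found for it, and rules without an expectation are skipped.

```python
def check_rule_expectations(rules):
    """Return one message per rule whose declared expectation is not met.

    rules: list of dicts with keys "name", "expected_sr" (int, or None when
    the rule has no %expect) and "actual_sr" (int).  Hypothetical model.
    """
    problems = []
    for rule in rules:
        expected = rule["expected_sr"]
        if expected is None:
            continue  # no %expect on this rule: nothing to check
        if expected != rule["actual_sr"]:
            problems.append(
                "%s: shift/reduce conflicts: %d found, %d expected"
                % (rule["name"], rule["actual_sr"], expected)
            )
    return problems


rules = [
    {"name": "argument_list: arguments", "expected_sr": 1, "actual_sr": 1},
    {"name": "argument_list: arguments ','", "expected_sr": None, "actual_sr": 0},
    {"name": "arguments: expression", "expected_sr": 2, "actual_sr": 0},
]
for message in check_rule_expectations(rules):
    print(message)
```

Only the third rule is reported here, since it declares more conflicts than were found.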

See https://lists.gnu.org/archive/html/bison-patches/2013-02/msg00105.html.

* doc/bison.texi (Suppressing Conflict Warnings): Document %expect,
%expect-rr in grammar rules.
* src/conflicts.c (count_state_rr_conflicts): Adjust comment.
(rule_has_state_sr_conflicts): New static function.
(count_rule_sr_conflicts): New static function.
(rule_has_state_rr_conflicts): New static function.
(count_rule_rr_conflicts): New static function.
(rule_conflicts_print): New static function.
(conflicts_print): Also use rule_conflicts_print to report on individual
rules.
* src/gram.h (struct rule): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* src/reader.c (grammar_midrule_action): Transfer expected_sr_conflicts,
expected_rr_conflicts to new rule, and turn off in current_rule.
(grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
(packgram): Transfer expected_sr_conflicts, expected_rr_conflicts
to new rule.
* src/reader.h (grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
* src/symlist.c (symbol_list_sym_new): Initialize expected_sr_conflicts,
expected_rr_conflicts.
* src/symlist.h (struct symbol_list): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* tests/conflicts.at: Add tests "%expect in grammar rule not enough",
"%expect in grammar rule right.", "%expect in grammar rule too much."
2018-11-21 22:08:47 +01:00
Akim Demaille
03a13ce793 reader: recognize C++ even when it's not lalr1.cc or glr.cc
* src/reader.c (grammar_rule_check_and_complete): If a user uses her
own skeleton but sets the language to C++, recognize it as C++.
2018-10-17 17:53:51 +02:00
Akim Demaille
e3fdc37049 generate the default action only for C++
This commit adds restrictions to what was done in
01898726e2 [1].

Rici Lake [2] has shown that it's risky to disable the pre-action, at
least now.  Also, generating the default $$ = $1 action can have bad
effects in some cases [3].

The original change [1] was prompted for C++.  Let's try it there
only, for a start.  We could restrict it further to lalr1.cc with
variants, but we need to see in the wild how this change behaves.  And
it is not unreasonable to expect grammar files in C++ to behave better
wrt types.

See
[1] https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00050.html
[2] https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00061.html
[3] https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00066.html

* src/getargs.c: Style changes.
* src/reader.c (grammar_rule_check_and_complete): Complete only for
C++.
2018-10-16 13:41:09 +02:00
Akim Demaille
01898726e2 generate the default semantic action
Currently, in C, the default semantic action is implemented by being
always run before running the actual user semantic action.  As a
consequence, when the user action is run, $$ is already set as $1.

In C++ with variants, we don't do that, since we cannot manipulate the
semantic value without knowing its exact type.  When variants are
enabled, the only guarantee is that $$ is default constructed and ready
to be used.

Some users still would like the default action to be run with
variants.  Frank Heckenbach's parser in
C++17 (http://lists.gnu.org/archive/html/bug-bison/2018-04/msg00011.html)
provides this feature, but relying on std::variant's dynamic typing,
which we forbid in lalr1.cc.

The simplest seems to be actually generating the default semantic
action (in all languages/skeletons).  This makes the pre-action (that
sets $$ to $1) useless.  But...  maybe some users depend on this, in
spite of the comments that clearly warn against this.  So let's not
turn this off just yet.
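
What "generating the default semantic action" amounts to can be sketched in Python (an illustration under assumptions: the function name and the exact conditions for "when applicable" are guesses, not the logic of src/reader.c).  A rule without a user action gets an explicit `$$ = $1` when its first rhs symbol has the same type as its lhs; a rule with a user action is left alone.

```python
DEFAULT_ACTION = "{ $$ = $1; }"


def complete_rule(lhs_type, rhs_types, action):
    """Return the action a rule should end up with (hypothetical model).

    lhs_type:  type tag of the left-hand side, or None if untyped.
    rhs_types: list of type tags of the rhs symbols (None for untyped).
    action:    the user's action text, or None if the rule has none.
    """
    if action is not None:
        return action  # never override what the user wrote
    if rhs_types and lhs_type is not None and rhs_types[0] == lhs_type:
        return DEFAULT_ACTION  # assumption: only when the types agree
    return None


print(complete_rule("ival", ["ival", None], None))
print(complete_rule("ival", ["ival"], "{ $$ = $1 + 1; }"))
print(complete_rule("ival", [], None))
```

Since the action is now part of the rule itself, it no longer relies on a pre-action that copies $1 into $$ before every user action.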

* src/reader.c (grammar_rule_check): Rename as...
(grammar_rule_check_and_complete): this.
Install the default semantic action when applicable.
* examples/variant-11.yy, examples/variant.yy, tests/calc.at:
Exercise the default semantic action, even with variants.
2018-10-14 18:53:21 +02:00
Akim Demaille
45ef3d92a1 reader: reorder some calls to separate checks from assignments
* src/reader.c (packgram): Move assignments to rules[ruleno] after the
checks on the rule.
2018-10-14 15:20:39 +02:00
Akim Demaille
bbfa419b89 style: use midrule only, not mid-rule
The code was already using midrule only, never mid_rule.  This is
simpler to remember, and matches a similar change we made from
look-ahead to lookahead.

* NEWS, doc/bison.texi, src/reader.c, src/scan-code.h, src/scan-code.l
* tests/actions.at, tests/c++.at, tests/existing.at: here.
2018-09-19 22:09:53 +02:00
Akim Demaille
8bc4348cc7 reader: simplify the search of the start symbol
Suggested by Paul Eggert.

* src/reader.c (find_start_symbol): Don't check 'res', we know it is
not null.  That suffices to avoid the GCC warnings.
* bootstrap.conf: We don't need 'assume', which doesn't exist anyway.
2018-08-17 06:22:47 +02:00
Akim Demaille
7783ba2d4f fix incorrect C code
Commit 3df32101e7 introduced invalid C
code.  Caught by GCC 7.3.0.

* bootstrap.conf (gnulib_modules): We need assume.
* src/reader.c (find_start_symbol): Fix the signature (too much C++,
sorry...).
Prefer 'assume' to 'assert', so that we don't have these warnings even
when NDEBUG is defined.
2018-08-15 14:39:46 +02:00
Akim Demaille
9a5c688ae4 style: src: remove useless reference to 'int' in integral types
* src/AnnotationList.c, src/AnnotationList.h, src/InadequacyList.h,
* src/closure.c, src/closure.h, src/gram.c, src/gram.h, src/ielr.c,
* src/location.c, src/output.c, src/reader.c, src/relation.c,
* src/scan-code.l, src/scan-gram.l, src/tables.c, src/tables.h:
Prefer 'unsigned' to 'unsigned int'.  Likewise for long and short.
2018-08-14 06:15:41 +02:00
Akim Demaille
da8f4a2f5f rule actions cannot be typed
Make sure that we cannot apply a type to the (main) action of a rule.

* src/reader.c (grammar_rule_check): Issue the warning.
* tests/input.at (Cannot type action): Check the warning.
2018-08-11 18:09:29 +02:00
Akim Demaille
f18f71cfb0 warn about typed mid-rule actions in Yacc mode
* src/reader.c (grammar_current_rule_action_append): Warn.
* tests/input.at (AT_CHECK_UNUSED_VALUES): Check.
2018-08-11 18:09:29 +02:00
Akim Demaille
7b24c424b5 add support for typed mid-rule actions
Prompted by Piotr Marcińczyk's message:
http://lists.gnu.org/archive/html/bug-bison/2017-06/msg00000.html.
See also http://lists.gnu.org/archive/html/bug-bison/2018-06/msg00001.html.

Because their type is unknown to Bison, the values of midrule actions are
not treated like the others: they don't have %printer and %destructor
support.  In addition, in C++, (Bison) variants cannot work properly.

Typed midrule actions address these issues.  Instead of:

    exp: { $<ival>$ = 1; } { $<ival>$ = 2; }   { $$ = $<ival>1 + $<ival>2; }

write:

    exp: <ival>{ $$ = 1; } <ival>{ $$ = 2; }   { $$ = $1 + $2; }

* src/scan-code.h, src/scan-code.l (code_props): Add a `type` field to
record the declared type of an action.
(code_props_rule_action_init): Add a type argument.
* src/parse-gram.y: Accept an optional type tag for actions.
* src/reader.h, src/reader.c (grammar_current_rule_action_append): Add
a type argument.
(grammar_midrule_action): When a mid-rule is typed, pass its type to
the defined dummy non terminal symbol.
2018-08-11 18:09:29 +02:00
Akim Demaille
3df32101e7 warnings: address -Wnull-dereference in reader.c
Based on a patch by David Michael.
http://lists.gnu.org/archive/html/bison-patches/2018-07/msg00000.html

* src/reader.c (find_start): New, extracted from...
(check_and_convert_grammar): here.
2018-08-05 20:25:58 +02:00