Some members are called foo_location, others are foo_loc. Stick to
the latter.
* src/gram.h, src/location.h, src/location.c, src/output.c,
* src/parse-gram.y, src/reader.h, src/reader.c, src/reduce.c,
* src/scan-gram.l, src/symlist.h, src/symlist.c, src/symtab.h,
* src/symtab.c:
Use _loc consistently, not _location.
symbol_list features a 'location' and a 'sym_loc' member. The former
is expected to be set only for symbol_lists that denote a symbol (not
a type name), and the latter should only denote the location of the
symbol/type name. Yet both are set, and the name "location" is too
unprecise.
* src/symlist.h, src/symlist.c (symbol_list::location): Rename as
rhs_loc for clarity. Move it to the "section" of data valid only
for rules.
* src/reader.c, src/scan-code.l: Adjust.
* src/lalr.c: Move logs to a better place to understand the chronology
of events.
* src/symlist.c (symbol_list_syms_print): Don't dump core on type
elements.
Prompted by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html
We have four classes of directives that declare symbols: %nterm,
%type, %token, and the family of %left etc. Currently not all of them
support the possibility to have several type tags (`<type>`), and not
all of them support the fact of not having any type tag at all
(%type). Let's unify this.
- %type
POSIX Yacc specifies that %type is for nonterminals only. However,
some Bison users want to use it for both tokens and nterms
(actually, Bison's own grammar does this in several places, e.g.,
CHAR). So it should accept char/string literals.
As a consequence cannot be used to declare tokens with their alias:
`%type foo "foo"` would be ambiguous (are we defining foo = "foo",
or are these two different symbols?)
POSIX specifies that it is OK to use %type without a type tag. I'm
not sure what it means, but we support it.
- %token
Accept token declarations with number and string literal:
(ID|CHAR) NUM? STRING?.
- %left, etc.
They cannot be the same as %token, because we accept to declare the
symbol with %token, and to then qualify its precedence with %left.
Then `%left foo "foo"` would also be ambiguous: foo="foo", or two
symbols.
They cannot be simply a list of identifiers, but POSIX Yacc says we
can declare token numbers here. I personally think this is a bad
idea, precedence management is tricky in itself and should not be
cluttered with token declaration issues.
We used to accept declaring a token number on a string literal here
(e.g., `%left "token" 1`). This is abnormal. Either the feature is
useful, and then it should be supported in %token, or it's useless
and we should not support it in corner cases.
- %nterm
Obviously cannot accept tokens, nor char/string literals. Does not
exist in POSIX Yacc, but since %type also works for terminals, it is
a nice option to have.
* src/parse-gram.y: Avoid relying on side effects. For instance, get
rid of current_type, rather, build the list of symbols and iterate
over it to assign the type.
It's not always possible/convenient. For instance, we still use
current_class.
Prefer "decl" to "def", since in the rest of the implementation we
actually "declare" symbols, we don't "define" them.
(token_decls, token_decls_for_prec, symbol_decls, nterm_decls): New.
Use them for %token, %left, %type and %nterm.
* src/symlist.h, src/symlist.c (symbol_list_type_set): New.
* tests/regression.at b/tests/regression.at
(Token number in precedence declaration): We no longer accept
to give a number to string literals.
This change allows one to document (and check) which rules participate
in shift/reduce and reduce/reduce conflicts. This is particularly
important GLR parsers, where conflicts are a normal occurrence. For
example,
%glr-parser
%expect 1
%%
...
argument_list:
arguments %expect 1
| arguments ','
| %empty
;
arguments:
expression
| argument_list ',' expression
;
...
Looking at the output from -v, one can see that the shift-reduce
conflict here is due to the fact that the parser does not know whether
to reduce arguments to argument_list until it sees the token AFTER the
following ','. By marking the rule with %expect 1 (because there is a
conflict in one state), we document the source of the 1 overall shift-
reduce conflict.
In GLR parsers, we can use %expect-rr in a rule for reduce/reduce
conflicts. In this case, we mark each of the conflicting rules. For
example,
%glr-parser
%expect-rr 1
%%
stmt:
target_list '=' expr ';'
| expr_list ';'
;
target_list:
target
| target ',' target_list
;
target:
ID %expect-rr 1
;
expr_list:
expr
| expr ',' expr_list
;
expr:
ID %expect-rr 1
| ...
;
In a statement such as
x, y = 3, 4;
the parser must reduce x to a target or an expr, but does not know
which until it sees the '='. So we notate the two possible reductions
to indicate that each conflicts in one rule.
See https://lists.gnu.org/archive/html/bison-patches/2013-02/msg00105.html.
* doc/bison.texi (Suppressing Conflict Warnings): Document %expect,
%expect-rr in grammar rules.
* src/conflicts.c (count_state_rr_conflicts): Adjust comment.
(rule_has_state_sr_conflicts): New static function.
(count_rule_sr_conflicts): New static function.
(rule_nast_state_rr_conflicts): New static function.
(count_rule_rr_conflicts): New static function.
(rule_conflicts_print): New static function.
(conflicts_print): Also use rule_conflicts_print to report on individual
rules.
* src/gram.h (struct rule): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* src/reader.c (grammar_midrule_action): Transfer expected_sr_conflicts,
expected_rr_conflicts to new rule, and turn off in current_rule.
(grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
(packgram): Transfer expected_sr_conflicts, expected_rr_conflicts
to new rule.
* src/reader.h (grammar_current_rule_expect_sr): New function.
(grammar_current_rule_expect_rr): New function.
* src/symlist.c (symbol_list_sym_new): Initialize expected_sr_conflicts,
expected_rr_conflicts.
* src/symlist.h (struct symbol_list): Add new fields expected_sr_conflicts,
expected_rr_conflicts.
* tests/conflicts.at: Add tests "%expect in grammar rule not enough",
"%expect in grammar rule right.", "%expect in grammar rule too much."
* origin/maint:
maint: post-release administrivia
version 3.0.4
gnulib: update
build: re-enable compiler warnings, and fix them
tests: c++: fix a C++03 conformance issue
tests: fix a title
c++: reserve 200 slots in the parser's stack
tests: be more robust to unrecognized synclines, and try to recognize xlc
tests: fix C++ conformance
build: fix some warnings
build: avoid infinite recursions on include_next
There are warnings (-Wextra) in generated C++ code:
ltlparse.cc: In member function 'ltlyy::parser::symbol_number_type
ltlyy::parser::by_state::type_get() const':
ltlparse.cc:452:33: warning: enumeral and non-enumeral type in
conditional expression
return state == empty_state ? empty_symbol : yystos_[state];
Reported by Alexandre Duret-Lutz.
It turns out that -Wall and -Wextra were disabled because of a stupid
typo.
* configure.ac: Fix the stupid typo.
* data/lalr1.cc, src/AnnotationList.c, src/InadequacyList.c,
* src/ielr.c, src/print.c, src/scan-code.l, src/symlist.c,
* src/symlist.h, src/symtab.c, src/tables.c, tests/actions.at,
* tests/calc.at, tests/cxx-type.at, tests/glr-regression.at,
* tests/named-refs.at, tests/torture.at:
Fix warnings, mostly issues about variables used only with assertions,
which are disabled with -DNDEBUG.
* origin/maint:
tests: split a large test case into several smaller ones
package: a bit of trouble shooting indications
doc: liby's main arms the internationalization
bison: avoid warnings from static code analysis
c++: fix the use of destructors when variants are enabled
style: tests: simplify the handling of some C++ tests
c++: symbols can be empty, so use it
c++: variants: don't leak the lookahead in error recovery
c++: provide a means to clear symbols
c++: clean up the handling of empty symbols
c++: comment and style changes
c++: variants: comparing addresses of typeid.name() is undefined
c++: locations: complete the API and fix comments
build: do not clean figure sources in make clean
A static analysis tool reports that some callers of symbol_list_n_get
might get NULL and not handle it properly. This is not the case, yet
we can suppress this pattern.
Reported by Mike Sullivan.
<https://lists.gnu.org/archive/html/bug-bison/2013-12/msg00027.html>
* src/symlist.c (symbol_list_n_get): Actually it is never called
to return 0. Enforce this postcondition via aver.
(symbol_list_n_type_name_get): Simplify accordingly. In particular,
discards a (translated) useless error message.
* src/symlist.h: Adjust documentation.
* src/scan-code.l: Style change.
* origin/maint:
build: don't try to generate docs when cross-compiling
package: fix a reporter's name
%union: fix the support for named %union
package: bump to 2015
flex: don't trust YY_USER_INIT
yacc.c: fix broken union when api.value.type=union and %defines are used
doc: fix missing xref
gnulib: update
location: remove some ugly debugging code traces
build: use abort to pacify compiler errors
package: bump to 2014
doc: specify documentation encoding
Rather than having duplicate info in the symbol and the alias that has
to be resolved later on, both the symbol and the alias have a common
pointer to a separate structure containing this info.
* src/symtab.h (sym_content): New structure.
* src/symtab.c (sym_content_new, sym_content_free, symbol_free): New
* src/AnnotationList.c, src/conflicts.c, src/gram.c, src/gram.h,
* src/graphviz.c, src/ielr.c, src/output.c, src/parse-gram.y, src/print.c
* src/print-xml.c, src/print_graph.c, src/reader.c, src/reduce.c,
* src/state.h, src/symlist.c, src/symtab.c, src/symtab.h, src/tables.c:
Adjust.
* tests/input.at: Fix expectations (order changes).
When reporting a duplicate directive on a rule, point to its first
occurrence:
one.y:11.10-15: error: only one %empty allowed per rule
%empty {} %empty
^^^^^^
one.y:11.3-8: previous declaration
%empty {} %empty
^^^^^^
And consistently discard the second one.
* src/complain.h, src/complain.c (duplicate_directive): New.
* src/reader.c: Use it where appropriate.
* src/symlist.h, src/symlist.c (symbol_list): Add a dprec_location member.
* tests/actions.at: Adjust expected output.
Provide a means to explicitly denote empty right-hand sides of rules:
instead of
exp: { ... }
allow
exp: %empty { ... }
Make sure that %empty is properly used.
With help from Joel E. Denny and Gabriel Rassoul.
http://lists.gnu.org/archive/html/bison-patches/2013-01/msg00142.html
* src/reader.h, src/reader.c (grammar_current_rule_empty_set): New.
* src/parse-gram.y (%empty): New token.
Use it.
* src/scan-gram.l (%empty): Scan it.
* src/reader.c (grammar_rule_check): Check that %empty is properly used.
* tests/actions.at (Invalid uses of %empty, Valid uses of %empty): New.
* src/complain.c: Space changes.
* src/reader.c: Comment changes.
Avoid && in assertions.
* src/location.c: Move comments to...
* src/location.h: here.
* src/symlist.h, src/symlist.c: Create a pseudo section for members
that apply to the rule.
In a declaration %token A B, the token A is declared before B, but in %left
A B (or with %precedence or %nonassoc or %right), the token B was declared
before A (tokens were declared in reverse order).
* src/symlist.h, src/symlist.c (symbol_list_append): New.
* src/parse-gram.y: Use it instead of symbol_list_prepend.
* tests/input.at: Adjust expectations.
* src/symlist.c (symbol_list_free): Deep free it.
* src/symtab.c (symbols_free, semantic_types_sorted): Free it too.
(symbols_do, sorted): Call by address.
* origin/maint: (29 commits)
regen
synclines: remove spurious empty line
also support $<foo>$ in the %initial-action
skeletons: b4_dollar_pushdef and popdef to simpify complex definitions
regen
printer/destructor: translate only once
factor the handling of m4 escaping
news: schedule the removal of the ";" hack
style changes in the scanners
regen
support $<tag>$ in printers and destructors
scan-code: factor the handling of the type in $<TYPE>$
muscles: fix another occurrence of unescaped type name
glr.cc: fix the handling of yydebug
gnulib: update
formatting changes
tests: fix an assertion
tests: adjust to GCC 4.8, which displays caret errors
be sure to properly escape type names
obstack_quote: escape and quote for M4
muscles: shuffle responsabilities
muscles: make private functions static
muscles: rename private functions/macros
obstack_escape: escape M4 characters
remove dead macro
maint: style changes
doc: avoid problems with case insensitive file systems
configure: fix botched quoting
news: fix typo.
Conflicts:
NEWS
data/c.m4
data/glr.cc
data/lalr1.cc
examples/rpcalc/local.mk
src/muscle-tab.h
src/output.c
src/parse-gram.c
src/parse-gram.h
src/parse-gram.y
src/scan-code.l
src/symlist.c
src/symlist.h
src/symtab.h
tests/calc.at
Currently "%printer {...} a b c d e f" translates the {...} six times.
Not only is this bad for time and space, it also issues six times the
same warnings.
* src/symlist.h, src/symlist.c (symbol_list_destructor_set)
(symbol_list_printer_set): Take the action as code_props instead of
const char *.
* src/parse-gram.y: Translate these actions here.
* src/scan-code.h: Comment change.
* tests/input.at: Check that warnings are issued only once.
Currently they are treated in separated variables, contrary to other
<TYPE> code_props. This duplicates code (and messages for translators)
uselessly, as demonstrated by the fact that thanks to this patch, now
useless <*> and <> code_props are reported like the others.
* src/parse-gram.y (generic_symlist_item): Treat "<*>" and "<>" as regular
type tags.
* src/symlist.h, src/symlist.c (symbol_list_default_tagged_new)
(symbol_list_default_tagless_new,SYMLIST_DEFAULT_TAGGED)
(SYMLIST_DEFAULT_TAGLESS): Remove.
* src/symtab.h, src/symtab.c (default_tagged_code_props)
(default_tagless_code_props, default_tagged_code_props_set)
(default_tagless_code_props_set): Remove.
(symbol_code_props_get): Default to <*> or <>'s code_props.
* tests/actions.at: Complete expected errors: there are new warnings.
* tests/input.at: Likewise.
(Useless printers or destructors): Extend.
* src/symtab.h (symbol_list): Represent semantic types as structure
'semantic_type'.
* src/symlist.c (symbol_list_type_new): Allocate this structure.
(symbol_list_code_props_set): Set this semantic type's status to used if it
was not declared.
* src/symtab.c (semantic_types_sorted): New.
(semantic_type_new): Set the new semantic type's location appropriately.
(symbol_check_defined): If a symbol has a type, then set this type's status
to "declared".
(semantic_type_check_defined, semantic_type_check_defined_processor): Same
as symbol_check_defined and symbol_check_defined_processor, but for semantic
types.
(symbol_check_defined): Check semantic types usefulness.
* src/symtab.h (semantic_type): New fields 'location' and 'status'.
* src/symtab.h, src/symtab.c (semantic_type_new)
(semantic_type_from_uniqstr, semantic_type_get): Accept a location as a
supplementary argument.
* tests/input.at (Unassociated types used for printer of destructor): New.
* tests/c++.at (AT_CHECK_VARIANTS): Fix an error caught by this commit.
There is too much code duplication between %printer and %destructor.
We used to have two functions for each action: the first one for
destructors, the second one for printers. Factor using a
'code_props_type', and an array of code_props instead of two
members.
* src/symlist.h, src/symlist.c (symbol_list_destructor_set)
(symbol_list_printer_set): Fuse into...
(symbol_list_code_props_set): this.
* src/symtab.h, src/symtab.c (default_tagged_destructor)
(default_tagged_printer): Fuse into...
(default_tagged_code_props): this.
(default_tagless_destructor, default_tagless_printer)
(default_tagless_code_props): Likewise.
(code_props_type_string): new.
(symbol_destructor_set, symbol_destructor_get, semantic_type_destructor_set)
(default_tagged_destructor_set, default_tagless_destructor_set)
(symbol_printer_set, symbol_printer_get, semantic_type_printer_set)
(default_tagged_printer_set, default_tagless_printer_set): Replace by...
(symbol_code_props_set, symbol_code_props_get, semantic_type_code_props_set)
(default_tagged_code_props_set, default_tagless_code_props_set): these.
* src/parse-gram.y (grammar_declaration): Adjust.
* src/output.c (CODE_PROP, grammar_declaration): Ditto.
* src/reader.c (symbol_should_be_used): Ditto.
The previous commit, which turns into a warning what used to be an
error:
%printer {} foo;
%%
exp: '0';
has two shortcomings: the warning is way too long (foo is reported
to be useless later), and besides, it also turns into a warning much
more serious errors:
%printer {} foo;
%%
exp: foo;
Reduce the amount to warnings in the first case, restore the error in
the second.
* src/symtab.h (status): Add a new inital state: undeclared.
* src/symtab.c (symbol_new): Initialize to undeclared.
(symbol_class_set): Simplify the logic of the code that neutralize
the "redeclared" warning after the "redefined" one.
(symbol_check_defined): "undeclared" is also an error.
* src/reader.c (grammar_current_rule_symbol_append): Symbols appearing
in a rule are "needed".
* src/symlist.c (symbol_list_destructor_set, symbol_list_printer_set):
An unknown symbol appearing in a %printer/%destructor is "used".
* src/reduce.c (nonterminals_reduce): Do not report as "useless" symbols
that are not used (e.g., those that for instance appeared only in a
%printer).
* tests/input.at (Undeclared symbols used for a printer or destructor):
Improve the cover the cases described above.
We used to raise an error if a symbol appears only in a %printer or
%destructor. Make it a warning.
* src/symtab.h (status): New enum.
(symbol): Replace the binary "declared" with the three-state "status".
Adjust dependencies.
* src/symtab.c (symbol_check_defined): Needed symbols are an error,
whereas "used" are simply warnings.
* src/symlist.c (symbol_list_destructor_set, symbol_list_printer): Set
symbol status to 'used' when associated to destructors or printers.
* input.at (Undeclared symbols used for a printer or destructor): New.
This change was made by applying emacs' untabify function to
nearly all files in Bison's repository. Required tabs in make
files, ChangeLog, regexps, and test code were manually skipped.
Other notable exceptions and changes are listed below.
* bootstrap: Skip because we sync this with gnulib.
* data/m4sugar/foreach.m4
* data/m4sugar/m4sugar.m4: Skip because we sync these with
Autoconf.
* djgpp: Skip because I don't know how to test djgpp properly, and
this code appears to be unmaintained anyway.
* README-hacking (Hacking): Specify that tabs should be avoided
where not required.
In `rhs[name]: "a" | "b"', do not free "name" twice.
Reported by Tys Lefering.
<http://lists.gnu.org/archive/html/bug-bison/2010-06/msg00002.html>
* src/named-ref.h, src/named-ref.c (named_ref_copy): New.
* src/parse-gram.y (current_lhs): Rename as...
(current_lhs_symbol): this.
(current_lhs): New function. Use it to free the current lhs
named reference.
* src/reader.c: Bind lhs to a copy of the current named reference.
* src/symlist.c: Rely on free (0) being valid.
* tests/named-refs.at: Test this.
(cherry picked from commit 8f462efe92)
Conflicts:
src/parse-gram.y
In `rhs[name]: "a" | "b"', do not free "name" twice.
Reported by Tys Lefering.
<http://lists.gnu.org/archive/html/bug-bison/2010-06/msg00002.html>
* src/named-ref.h, src/named-ref.c (named_ref_copy): New.
* src/parse-gram.y (current_lhs): Rename as...
(current_lhs_symbol): this.
(current_lhs): New function. Use it to free the current lhs
named reference.
* src/reader.c: Bind lhs to a copy of the current named reference.
* src/symlist.c: Rely on free (0) being valid.
* tests/named-refs.at: Test this.
* doc/bison.texinfo (Mid-Rule Actions): Mention that periods and
dashes make symbol names less convenient for named references.
* src/scan-code.l:
(handle_action_dollar): New arg textlen. All callers changed.
(handle_action_at): Likewise. Also, args are pointers to const.
(ref_tail_fields): Remove; no longer used.
(letter): Now includes '-' and '.', since this is for Bison
identifiers.
(id): Now the simpler traditional defn, since letters now include
'-' and '.'.
(c_letter, c_id): New defns.
(ref): Use c_id for unbracketed IDs.
(<SC_RULE_ACTION>): Simplify, now that the distinction between
Bison and unbracketed IDs are now in the regular expressions.
(VARIANT_BAD_BRACKETING): Remove.
(VARIANT_NOT_VISIBLE_FROM_MIDRULE): Renumber.
(find_prefix_end): Remove, replacing with ....
(identifier_matches): New function.
(variant_add): Use it. Omit EXPLICIT_BRACKETING arg; no longer
needed. CP arg is pointer to constant. All callers changed.
(show_sub_messages): Remove args CP, EXPLICIT_BRACKETING, DOLLAR_OR_AT.
New arg TEXT. All callers changed. Do not worry about showing
trailing context.
(parse_ref): Args CP, RULE, TEXT are now pointers to const. New
arg TEXTLEN. Remove arg DOLLAR_OR_AT. All callers changed.
Simplify code now that the regular expressions capture the
restrictions.
* src/scan-gram.l (letter, id): Adjust to match scan-code.l.
* src/symlist.c (symbol_list_null): Arg is now pointer to const.
* src/symlist.h: Likewise.
* tests/named-refs.at (Misleading references): These are now caught
by the C compiler, not by Bison; that's good enough. Adjust test
to reflect this.
(Many kinds of errors, Unresolved references): Adjust expected
diagnostics to match new behavior. The same errors are caught,
though the diagnostics are not quite as fancy.
($ or @ followed by . or -): Likewise. Also, Make the grammar
unambiguous, so that diagnostics are not complicated by ambiguity
warnings.