The "identifier and colon" of a rule is implemented as a single token,
but whose location is only that of the identifier (so that messages
about the lhs of a rule are accurate). When reducing empty rules, the
default location is the single point location on the end of the
previous symbol. As a consequence, when Bison parses a grammar, the
location of the right-hand side of an empty rule is based on the
lhs, *independently of the position of the colon*. And the colon can
be way farther, separated by comments, white spaces, including empty
lines.
As a result, some messages look really bad. For instance:
$ cat foo.y
%%
foo : /* empty */
bar
: /* empty */
gives
$ bison -Wall foo.y
foo.y:2.4: warning: empty rule without %empty [-Wempty-rule]
2 | foo : /* empty */
| ^
foo.y:3.4: warning: empty rule without %empty [-Wempty-rule]
3 | bar
| ^
The carets are not at the right column, not even the right line.
This commit passes the colon "again" after the "id colon" token, which
gives more accurate locations for these messages:
$ bison -Wall foo.y
foo.y:2.10: warning: empty rule without %empty [-Wempty-rule]
2 | foo : /* empty */
| ^
foo.y:4.2: warning: empty rule without %empty [-Wempty-rule]
4 | : /* empty */
| ^
* src/scan-gram.l (SC_AFTER_IDENTIFIER): Rollback the colon, so that
we scan it again afterwards.
(INITIAL): Scan colons.
* src/parse-gram.y (COLON): New.
(rules): Parse the colon after the rule's id_colon (and possible
named reference).
* tests/actions.at, tests/conflicts.at, tests/diagnostics.at,
* tests/existing.at: Adjust.
Bruno Haible just added a default implementation of libtextstyle's
interface when the library is not available.
https://lists.gnu.org/archive/html/bison-patches/2019-03/msg00025.html
* gnulib: Update.
* bootstrap.conf: Replace libtextstyle with libtextstyle-optional.
* src/complain.c, src/getargs.c: Remove now useless cpp guards.
Currently when --defines is used, we generate a header, and paste an
exact copy of it into the generated parser implementation file. Let's
provide a means to #include it instead.
We don't do it by default because of the Autotools' ylwrap. This
program wraps invocations of yacc (that uses a fixed output name:
y.tab.c, y.tab.h, y.output) to support a more modern naming
scheme (dir/foo.y -> dir/foo.tab.c, dir/foo.tab.h, etc.). It does
that by renaming the generated files, and then by running sed to
propagate these renamings inside the files themselves.
Unfortunately Automake's Makefiles uses Bison as if it were Yacc (with
--yacc or with -o y.tab.c) and invoke bison via ylwrap. As a
consequence, as far as Bison is concerned, the output files are
y.tab.c and y.tab.h, so it emits '#include "y.tab.h"'. So far, so
good. But now ylwrap processes this '#include "y.tab.h"' into
'#include "dir/foo.tab.h"', which is not guaranteed to always work.
So, let's do the Right Thing when the output file is not y.tab.c, in
which case the user should %define api.header.include. Binding this
behavior to --yacc is tempting, but we recently told people to stop
using --yacc (as it also enables the Yacc warnings), but rather to use
-o y.tab.c.
Yacc.c is the only skeleton concerned: all the others do include their
header.
* data/skeletons/yacc.c (b4_header_include_if): New.
(api.header.include): Provide a default value when the output is not
y.tab.c.
* src/parse-gram.y (api.header.include): Define.
Reported by Hans Åberg.
https://lists.gnu.org/archive/html/help-bison/2019-02/msg00064.html
* src/files.c (spec_graph_file): Use `*.gv` when 3.4 or better,
otherwise `*.dot`.
* src/parse-gram.y (handle_require): Pretend we are already 3.4.
* doc/bison.texi: Adjust.
* tests/local.at, tests/output.at: Exercise this.
Currently we have no simple example: rpcalc in reverse Polish, mfcalc
has functions, and lexcalc is using lex.
* examples/c/calc/Makefile, examples/c/calc/calc.y,
* examples/c/calc/calc.test, examples/c/calc/local.mk: New.
* maint:
maint: post-release administrivia
version 3.3.2
style: minor fixes
NEWS: named constructors are preferable to symbol_type ctors
gram: fix handling of nterms in actions when some are unused
style: rename local variable
CI: update the ICC serial number for travis-ci.org
Since Bison 3.3, semantic values in rule actions (i.e., '$...') are
passed to the m4 backend as the symbol number. Unfortunately, when
there are unused symbols, the symbols are renumbered _after_ the
numbers were used in the rule actions. As a result, the evaluation of
the skeleton failed because it used non existing symbol numbers.
Which is the happy scenario: we could use numbers of other existing
symbols...
Reported by Balázs Scheidler.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00044.html
Translating the rule actions after the symbol renumbering moves too
many parts in bison. Relying on the symbol identifiers is more
troublesome than it might first seem: some don't have an
identifier (tokens with only a literal string), some might have a
complex one (tokens with a literal string with characters special for
M4). Well, these are tokens, but nterms also have issues: "dummy"
nterms (for midrule actions) are named $@32 etc. which is risky for
M4.
Instead, let's simply give M4 the mapping between the old numbers and
the new ones. To avoid confusion between old and new numbers, always
emit pre-renumbering numbers as "orig NUM".
* data/README: Give details about "orig NUM".
* data/skeletons/bison.m4 (__b4_symbol, _b4_symbol): Resolve the
"orig NUM".
* src/output.c (prepare_symbol_definitions): Pass nterm_map to m4.
* src/reduce.h, src/reduce.c (nterm_map): Extract it from
nonterminals_reduce, to make it public.
(reduce_free): Free it.
* src/scan-code.l (handle_action_dollar): When referring to a nterm,
use "orig NUM".
* tests/reduce.at (Useless Parts): New, based Balázs Scheidler's
report.
When debugging Bison itself, this is very handy, especially when
tweaking the frontend badly enough to break the backends. It can also
be used to check a grammar.
* src/getargs.h, src/getargs.c (feature_syntax_only): New.
(feature_args, feature_types): Adjust.
* src/main.c (main): Use it.
The previous name was historical and inconsistent.
* src/muscle-tab.c (define_directive): Use the proper value passing
syntax, based on the muscle kind.
(muscle_percent_variable_update): Use the right value passing syntax.
Migrate from parser_class_name to api.parser.class.
* data/skeletons: Migrate from parser_class_name to api.parser.class.
* doc/bison.texi (%define Summary): Document both parser_class_name
and api.parser.class.
Promote the latter over the former.
We used to define such auxiliary structures outside the class, mainly
as a matter of style to keep the definition of yy::parser short and
simple. However, now there's a lot more code generated inside the
class definition (e.g., all the token constructors), so the
readability no longer applies.
However, if we move stack (and slice) inside yy::parser, then it
should no longer be needed to change the namespace to have multiple
parsers: changing the class name should suffice.
One common argument against inner classes is that they code bloat. It
hardly applies here, since typically different parsers will have
different semantic value types, hence different actual stack types.
* data/skeletons/lalr1.cc: Invoke b4_stack_define inside yy::parser.
Suggested by David Barto
https://lists.gnu.org/archive/html/help-bison/2015-02/msg00004.html
and Victor Zverovich.
https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00121.html
This is very easy to do, thanks to work by Bruno Haible in gnulib.
See "Supporting Relocation" in gnulib's documentation.
* bootstrap.conf: We need relocatable-prog and relocatable-script (for yacc).
* src/yacc.in: New.
* configure.ac, src/local.mk: Instantiate it.
* src/main.c, src/output.c (main, pkgdatadir): Use relocatable2.
* doc/bison.texi (FAQ): Document it.
Suggested by Wolfgang Thaller.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00081.html
* data/c++.m4 (basic_symbol, by_type): Instead of provide either move
or copy constructor, always provide the copy one.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check it.
Instead of introducing make_symbol (whose name, btw, somewhat
infringes on the user's "name space", if she defines a token named
"symbol"), let's make the construction of symbol_type safer, using
assertions.
For instance with:
%token ':' <std::string> ID <int> INT;
generate:
symbol_type (int token, const std::string&);
symbol_type (int token, const int&);
symbol_type (int token);
It does mean that now named token constructors (make_ID, make_INT,
etc.) go through a useless assert, but I think we can ignore this: I
assume any decent compiler will inline the symbol_type ctor inside the
make_TOKEN functions, which will show that the assert is trivially
verified, hence I expect no code will be emitted for it. And anyway,
that's an assert, NDEBUG controls it.
* data/c++.m4 (symbol_type): Turn into a subclass of
basic_symbol<by_type>.
Declare symbol constructors when variants are enabled.
* data/variant.hh (_b4_type_constructor_declare)
(_b4_type_constructor_define): Replace with...
(_b4_symbol_constructor_declare, _b4_symbol_constructor_def): these.
Generate symbol_type constructors.
* doc/bison.texi (Complete Symbols): Document.
* tests/types.at: Check.
After having spent quite some time on cleaning the handling of symbol
declarations in the grammar files, I believe we should keep it.
It looks like it's a duplicate of %type, but it is not. While POSIX
Yacc requires %type to apply only to nonterminal symbols, it appears
that both byacc and bison accept it for tokens too. And some
experienced users do actually expect this feature to group
symbols (terminal or not) by type ("On the other hand, it is generally
more useful IMHO to group terminals and non-terminals with the same
type tag together",
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html).
Even Bison's own parser does this today (see CHAR).
Basically reverts 7928c3e6fb.
* src/scan-gram.l (%nterm): Dedeprecate, but issue a Wyacc warning.
* tests/input.at: Adjust expectations.
(Yacc warnings on symbols): New.
* src/symtab.c (symbol_class_set): Fix error introduced in
20b0746793.
Convert some of the READMEs to Markdown, which is now more common, and
nicely displayed in some git hosting services.
Add missing READMEs and Makefiles. Generate XML, HTML and Dot files. Be
sure to ship the test files. Complete CLEANFILES to remove all generated
files.
* examples/calc++: Move into...
* examples/c++: here.
* examples/mfcalc, examples/rpcalc: Move into...
* examples/c: here.
* examples/README.md, examples/c++/calc++/Makefile, examples/c/local.mk,
* examples/c/mfcalc/Makefile, examples/c/rpcalc/Makefile,
* examples/d/README.md, examples/java/README.md:
New files.
* examples/test (medir): Be robust to deeper directory nesting.
On a grammar such as
exp: "num" | "num" | "num"
we currently report only one RR conflict, instead of two.
This bug is present since the origins of Bison
commit 08089d5d35
Author: David MacKenzie <djm@djmnet.org>
Date: Tue Apr 20 05:42:52 1993 +0000
Initial revision
and was preserved in
commit 676385e29c
Author: Paul Hilfinger <Hilfinger@CS.Berkeley.EDU>
Date: Fri Jun 28 02:26:44 2002 +0000
Initial check-in introducing experimental GLR parsing. See entry in
ChangeLog dated 2002-06-27 from Paul Hilfinger for details.
See
https://lists.gnu.org/archive/html/bison-patches/2018-11/msg00011.html
* src/conflicts.h, src/conflicts.c (count_state_rr_conflicts)
(count_rr_conflicts): Use only the correct count of conflicts.
* tests/glr-regression.at: Fix expectations.
It is unfortunate that %error_verbose was properly diagnosed as
obsoleted by "%define parse.error verbose", but %error-verbose was
not.
* src/parse-gram.y (%error-verbose): Remove support.
* src/scan-gram.l: Do it here instead, with a warning.
* tests/input.at (Deprecated directives): Check it.
Support for DJGPP was announced to be removed in the NEWS of Bison
3.1 (2018-08-27) unless someone expressed interest. There was no answer.
* djgpp: Remove.
* NEWS, Makefile.am, cfg.mk, po/POTFILES.in: Adjust.