The variable spec_defines_file denotes the name of the generated
header. Its name is derived from --defines/%defines, whose name in
turn is derived from the fact that the header, in Yacc, contained the
Not only does the header now contain a lot more than just the token
definitions, we no longer even generate macros, but an enum...
Let's modernize our vocabulary.
* src/files.h, src/files.c (spec_defines_file): Rename as...
(spec_header_file): this.
Currently when --defines is used, we generate a header, and paste an
exact copy of it into the generated parser implementation file. Let's
provide a means to #include it instead.
We don't do it by default because of the Autotools' ylwrap. This
program wraps invocations of yacc (which uses fixed output names:
y.tab.c, y.tab.h, y.output) to support a more modern naming
scheme (dir/foo.y -> dir/foo.tab.c, dir/foo.tab.h, etc.). It does
that by renaming the generated files, and then by running sed to
propagate these renamings inside the files themselves.
Unfortunately Automake's Makefiles use Bison as if it were Yacc (with
--yacc or with -o y.tab.c) and invoke bison via ylwrap. As a
consequence, as far as Bison is concerned, the output files are
y.tab.c and y.tab.h, so it emits '#include "y.tab.h"'. So far, so
good. But now ylwrap processes this '#include "y.tab.h"' into
'#include "dir/foo.tab.h"', which is not guaranteed to always work.
So, let's do the Right Thing when the output file is not y.tab.c, in
which case the user should %define api.header.include. Binding this
behavior to --yacc is tempting, but we recently told people to stop
using --yacc (as it also enables the Yacc warnings), but rather to use
-o y.tab.c.
Yacc.c is the only skeleton concerned: all the others do include their
header.
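
To illustrate the ylwrap renaming described above, here is a minimal
sketch of the textual substitution it applies to the generated files.
The helper name propagate_renaming is hypothetical, and the real ylwrap
uses sed with more careful patterns; this only shows the effect on the
'#include' line.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replace every occurrence of FROM by TO in TEXT,
// mimicking (very roughly) the sed pass that ylwrap runs on generated files.
std::string
propagate_renaming (std::string text,
                    const std::string& from, const std::string& to)
{
  if (from.empty ())
    return text;
  std::string::size_type pos = text.find (from);
  while (pos != std::string::npos)
    {
      text.replace (pos, from.size (), to);
      // Resume after the inserted text so that TO is never rescanned.
      pos = text.find (from, pos + to.size ());
    }
  return text;
}
```

Applied to '#include "y.tab.h"' with the pair y.tab.h -> dir/foo.tab.h,
this produces '#include "dir/foo.tab.h"', the form that is not
guaranteed to always work.
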
* data/skeletons/yacc.c (b4_header_include_if): New.
(api.header.include): Provide a default value when the output is not
y.tab.c.
* src/parse-gram.y (api.header.include): Define.
* data/skeletons/yacc.c: here.
This is more logical for the time stamps, but it's also required by
following patches: the shared declarations are also in charge of
handling api.value.type=union. So far, they are run in the
implementation file in both cases (with or without header). But if we
run them only in the header, then the implementation file is emitted
with incorrect support for api.value.type=union.
Arguably we should not have such dependencies. This is because we
have side-effects in our backend (redefining the symbols' type and
type_tag). In the future we should find a better solution for this,
without sacrificing the independence of the backend from bison
itself (i.e., I don't think we should handle api.value.type=union in
bison, leave it to m4).
Currently we generate things like:
#line 683 "src/parse-gram.y" /* yacc.c:316 */
The first part is of course very important: compilers point the users
to their grammar file rather than into the generated parser. The
second part points to the place in the skeletons that generated this
piece of code.
This dependency on the Bison skeletons generates lots of useless 'git
diff'. This location is useless for the regular user (who does not
care about the skeletons) and is actually not useful for Bison
developers either (I never used this to locate the code in skeletons
that generated output). So disable it completely. If someone thinks
this was actually useful, a %define variable should be provided to
control the level of verbosity of '#line', in replacement of
--no-lines.
So now, generate:
#line 683 "src/parse-gram.y"
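
A minimal sketch of what the remaining '#line' directive does for the
user: the compiler reports positions in the grammar file rather than in
the generated parser. The struct and function names are hypothetical;
only the directive itself comes from the text above.

```cpp
#include <cassert>
#include <string>

// Hypothetical record of where the compiler believes some code lives.
struct source_position
{
  std::string file;
  int line;
};

// From the directive on, the compiler numbers lines as if they came from
// the grammar file: the line right after the directive is line 683.
#line 683 "src/parse-gram.y"
inline source_position where () { source_position p = { __FILE__, __LINE__ }; return p; }
```

Here where () reports the file and line set by the directive, which is
what a diagnostic emitted on that line would point to.
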
* data/skeletons/bison.m4 (b4_sync_end): Emit nothing.
* configure.ac (DCFLAGS): Define.
* tests/atlocal.in: Receive it.
* data/skeletons/d.m4 (api.parser.class): Remove spurious YY.
* data/skeletons/lalr1.d (yylex): Return an int instead of a
YYTokenType, so that we can use characters as tokens.
* examples/d/calc.y: Adjust.
* tests/local.at: Initial support for D.
(AT_D_IF, AT_DATA_GRAMMAR(D), AT_YYERROR_DECLARE(d))
(AT_YYERROR_DECLARE_EXTERN(d), AT_YYERROR_DEFINE(d))
(AT_MAIN_DEFINE(d), AT_COMPILE_D, AT_LANG_COMPILE(d), AT_LANG_EXT(d)):
New.
* tests/calc.at: Initial support for D.
* tests/headers.at
While hacking on the computation of the automaton, I had yystate being
equal to -1, and the parser looped. Let's catch this when
parse.assert is enabled.
* data/skeletons/yacc.c (YY_ASSERT): New.
Use it.
Not using the name YYASSERT, to make it clear that this is private.
glr.c should probably move to YY_ASSERT too.
Also, while at it, report 'Entering state...' even before growing the
stacks.
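
A minimal sketch of such a private assertion, assuming a compile-time
switch named PARSE_ASSERT_ENABLED (hypothetical: the skeleton decides
this in m4). The helpers state_is_valid and enter_state are also
hypothetical, and only illustrate catching yystate == -1.

```cpp
#include <cassert>

// Assumption: PARSE_ASSERT_ENABLED stands for "parse.assert was requested".
#ifndef PARSE_ASSERT_ENABLED
# define PARSE_ASSERT_ENABLED 1
#endif

// Private assertion: the YY_ prefix (rather than YYASSERT) signals that
// users are not expected to override it.
#if PARSE_ASSERT_ENABLED
# define YY_ASSERT(Cond) assert (Cond)
#else
# define YY_ASSERT(Cond) ((void) 0)
#endif

// Hypothetical predicate: a state number must index the automaton's states.
inline bool
state_is_valid (int yystate, int nstates)
{
  return 0 <= yystate && yystate < nstates;
}

// Hypothetical step of the parser loop: check the state before using it.
inline int
enter_state (int yystate, int nstates)
{
  YY_ASSERT (state_is_valid (yystate, nstates));
  return yystate;
}
```
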
* maint:
maint: post-release administrivia
version 3.3.2
style: minor fixes
NEWS: named constructors are preferable to symbol_type ctors
gram: fix handling of nterms in actions when some are unused
style: rename local variable
CI: update the ICC serial number for travis-ci.org
Since Bison 3.3, semantic values in rule actions (i.e., '$...') are
passed to the m4 backend as the symbol number. Unfortunately, when
there are unused symbols, the symbols are renumbered _after_ the
numbers were used in the rule actions. As a result, the evaluation of
the skeleton failed because it used nonexistent symbol numbers.
And that is the happy scenario: we could just as well have used the
numbers of other, existing symbols...
Reported by Balázs Scheidler.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00044.html
Translating the rule actions after the symbol renumbering moves too
many parts in bison. Relying on the symbol identifiers is more
troublesome than it might first seem: some don't have an
identifier (tokens with only a literal string), some might have a
complex one (tokens with a literal string with characters special for
M4). Well, these are tokens, but nterms also have issues: "dummy"
nterms (for midrule actions) are named $@32 etc. which is risky for
M4.
Instead, let's simply give M4 the mapping between the old numbers and
the new ones. To avoid confusion between old and new numbers, always
emit pre-renumbering numbers as "orig NUM".
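
A minimal sketch of the idea, under simplifying assumptions: symbols
are plain indices, and useless ones map to -1 (an illustrative
convention, not necessarily what bison does). The function names
make_nterm_map and resolve_orig are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build the old-number -> new-number map.  USEFUL[i] tells whether
// symbol number I survives the reduction.
std::vector<int>
make_nterm_map (const std::vector<bool>& useful)
{
  std::vector<int> res (useful.size (), -1);
  int next = 0;
  for (std::size_t i = 0; i < useful.size (); ++i)
    if (useful[i])
      res[i] = next++;
  return res;
}

// Resolve an "orig NUM" reference to the number valid after renumbering.
// Throws std::out_of_range if ORIG is not a known old number.
int
resolve_orig (const std::vector<int>& nterm_map, int orig)
{
  return nterm_map.at (orig);
}
```

With three symbols of which the middle one is unused, "orig 2" resolves
to 1: the action keeps referring to the right symbol even though its
number changed.
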
* data/README: Give details about "orig NUM".
* data/skeletons/bison.m4 (__b4_symbol, _b4_symbol): Resolve the
"orig NUM".
* src/output.c (prepare_symbol_definitions): Pass nterm_map to m4.
* src/reduce.h, src/reduce.c (nterm_map): Extract it from
nonterminals_reduce, to make it public.
(reduce_free): Free it.
* src/scan-code.l (handle_action_dollar): When referring to a nterm,
use "orig NUM".
* tests/reduce.at (Useless Parts): New, based on Balázs Scheidler's
report.
Reported by Derek Clegg.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00021.html
aux/parser-internal.h:429:12: error: 'syntax_error' has no out-of-line virtual
method definitions; its vtable will be emitted in every translation unit
[-Werror,-Wweak-vtables]
struct syntax_error : std::runtime_error
To avoid this warning, we need syntax_error to have a virtual function
defined in a compilation unit. Let it be the destructor. To comply
with C++98, this dtor should be 'throw()'. Merely making YY_NOEXCEPT
be 'throw()' in C++98 triggers
errors (http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00022.html),
so let's introduce YY_NOTHROW and flag only ~syntax_error with it.
Also, since we now have an explicit dtor, we need to provide a copy
ctor.
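
A minimal sketch of the resulting shape of syntax_error. The definition
of YY_NOTHROW below is illustrative; the skeleton's actual macro may
differ.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Illustrative definition: 'noexcept' where available, 'throw ()' in C++98.
#if 201103L <= __cplusplus
# define YY_NOTHROW noexcept
#else
# define YY_NOTHROW throw ()
#endif

struct syntax_error : std::runtime_error
{
  explicit syntax_error (const std::string& m)
    : std::runtime_error (m)
  {}

  // An explicit dtor calls for an explicit copy ctor.
  syntax_error (const syntax_error& s)
    : std::runtime_error (s.what ())
  {}

  // Declared here, defined out of line: this anchors the vtable in the
  // one translation unit that holds the definition.
  ~syntax_error () YY_NOTHROW;
};

// In a real project this definition lives in exactly one .cc file.
syntax_error::~syntax_error () YY_NOTHROW
{}
```
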
* configure.ac (warn_cxx): Add -Wweak-vtables.
* data/skeletons/c++.m4 (YY_NOTHROW): New.
(syntax_error): Declare the dtor, and define the copy ctor.
* data/skeletons/glr.cc, data/skeletons/lalr1.cc (~syntax_error):
Define.
Reported by Derek Clegg.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00004.html
* configure.ac (warn_common): Add -Wimplicit-fallthrough.
This does trigger failures in the test suite.
* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c, tests/c++.at:
Make fall-throughs explicit.
Reported by Derek Clegg.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00006.html
Clang does not like this:
template <typename D>
struct basic_symbol : D
{
basic_symbol();
};
struct by_type {};
struct symbol_type : basic_symbol<by_type>
{
symbol_type(){}
};
It gives:
$ clang++-mp-7.0 -Wundefined-func-template foo.cc -c
foo.cc:11:3: warning: instantiation of function 'basic_symbol<by_type>::basic_symbol'
required here, but no definition is available [-Wundefined-func-template]
symbol_type(){}
^
foo.cc:4:3: note: forward declaration of template entity is here
basic_symbol();
^
foo.cc:11:3: note: add an explicit instantiation declaration to suppress this warning
if 'basic_symbol<by_type>::basic_symbol' is explicitly instantiated in
another translation unit
symbol_type(){}
^
1 warning generated.
The same applies to basic_symbol's destructor and `clear()`.
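
A minimal sketch of the remedy, assuming the members are simply defined
inside the class template so that every translation unit including the
header can instantiate them. The 'cleared' member is hypothetical and
only makes the effect of clear () observable.

```cpp
#include <cassert>

template <typename D>
struct basic_symbol : D
{
  // Defined in the class body: no out-of-line template definition is
  // needed, so Clang has nothing to warn about.
  basic_symbol () : cleared (false) {}
  ~basic_symbol () { clear (); }
  void clear () { cleared = true; }

  bool cleared; // hypothetical member, for illustration only
};

struct by_type {};

struct symbol_type : basic_symbol<by_type>
{
  symbol_type () {}
};
```
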
* configure.ac (warn_cxx): Add -Wundefined-func-template.
This triggered one failure in the test suite:
* tests/headers.at (Sane headers): here, where we check that we can
compile the generated headers in other compilation units than the
parser's.
Add a variant type to make sure that basic_symbol and symbol_type are
properly generated in this case.
* data/skeletons/c++.m4 (basic_symbol): Inline the definitions of the
destructor and of `clear` in the class definition.
This line:
slice<stack_symbol_type, stack_type> slice (yystack_, yylen);
triggers warnings:
parse.h:1790:11: note: shadowed declaration is here
Reported by Frank Heckenbach.
http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00002.html
* configure.ac (warn_c): Move -Wshadow to...
(warn_common): here.
* data/skeletons/stack.hh (slice): Define as an inner class of stack.
* data/skeletons/lalr1.cc: Adjust.
Rename the variable as 'range' instead of 'slice'.
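
A minimal sketch of slice as an inner class of stack, with the local
variable named 'range' so that it no longer shadows the type. The
indexing conventions and the helper first_rhs_value are assumptions
made for this example, not a copy of stack.hh.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class stack
{
public:
  typedef std::size_t size_type;

  void push (const T& t) { seq_.push_back (t); }

  // Index from the top of the stack: 0 is the most recent element.
  const T& operator[] (size_type i) const
  {
    return seq_[seq_.size () - 1 - i];
  }

  // A view on the RANGE topmost elements, numbered 1..RANGE starting
  // from the deepest one, like $1..$n in a rule of length n.
  class slice
  {
  public:
    slice (const stack& s, size_type range) : stack_ (s), range_ (range) {}
    const T& operator[] (size_type i) const { return stack_[range_ - i]; }
  private:
    const stack& stack_;
    size_type range_;
  };

private:
  std::vector<T> seq_;
};

// Hypothetical caller: the variable is 'range', the type is 'slice'.
inline int
first_rhs_value (const stack<int>& yystack, std::size_t yylen)
{
  stack<int>::slice range (yystack, yylen);
  return range[1];
}
```
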
There are many macros that are defined and used just
once (b4_public_if, b4_abstract_if, etc.). That's overkill. Rather,
let's define a macro to build the "public class YYParser" line.
It appears that the same syntax with "extends", "abstract", etc. is
implemented in the D parser, which looks very fishy...
* data/skeletons/d.m4, data/skeletons/java.m4 (b4_public_if)
(b4_abstract_if, b4_final_if, b4_strictfp_if): Replace with
(b4_parser_class_declaration): this.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: Adjust.
Commit 90a8537e62 was right, but issued
two error messages. Commit 80ef7e7639
tried to address that by mapping yychar and yytoken to empty, but that
completely breaks the invariants of glr.c. In particular, yygetToken
can be called repeatedly and is expected to return the latest result,
unless yytoken is YYEMPTY. Since the previous attempt was "recording"
that the token was coming from an exception by setting it to YYEMPTY,
instead of getting again the faulty token, we fetched another one.
Rather, revert to the first approach: map yytoken to "invalid token",
but record in yychar the fact that we come from an exception thrown in
the scanner.
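
A minimal sketch of the invariant, assuming a simplified lookahead
record and illustrative constants (the values and names below are not
those of glr.c). A token is fetched only when none is pending, and a
scanner exception is recorded in yychar so that repeated calls keep
returning the faulty token instead of fetching another one.

```cpp
#include <cassert>
#include <stdexcept>

// Illustrative constants; the values used by glr.c may differ.
enum { empty_tok = -2, faulty_tok = -3, invalid_token = 1 };

// Hypothetical lookahead state, standing in for yychar and yytoken.
struct lookahead
{
  int yychar;
  int yytoken;
  int scanner_calls;
};

typedef int (*scanner_type) ();

inline int
get_token (lookahead& la, scanner_type scan)
{
  if (la.yychar == empty_tok)
    {
      ++la.scanner_calls;
      try
        {
          la.yychar = scan ();
          la.yytoken = la.yychar; // identity translation, for brevity
        }
      catch (const std::runtime_error&)
        {
          // Remember that the token comes from an exception.
          la.yychar = faulty_tok;
          la.yytoken = invalid_token;
        }
    }
  return la.yytoken;
}

// Hypothetical scanner that always fails.
inline int
failing_scanner ()
{
  throw std::runtime_error ("invalid character");
}
```
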
* data/skeletons/glr.c (YYFAULTYTOK): New.
(yygetToken): Use it to record syntax errors from the scanner.
* tests/c++.at (Syntax error as exception): In addition to checking
syntax_error with error recovery, make sure it also behaves as
expected without.
The previous name was historical and inconsistent.
* src/muscle-tab.c (define_directive): Use the proper value passing
syntax, based on the muscle kind.
(muscle_percent_variable_update): Use the right value passing syntax.
Migrate from parser_class_name to api.parser.class.
* data/skeletons: Migrate from parser_class_name to api.parser.class.
* doc/bison.texi (%define Summary): Document both parser_class_name
and api.parser.class.
Promote the latter over the former.
This is very debatable. This function is not pure at all, so it could
stick to returning void: that's a common coding style to tell the
difference between "real" (pure) functions and side-effecting
subroutines. However, we already have this style elsewhere (e.g.,
yylex), and I feel the callers are somewhat nice to read this way.
* data/skeletons/glr.c (yygetLRActions): Return the action rather than
passing by pointer.
While at it, fix type of yytoken.
Adjust callers.
Reported by Askar Safin.
https://lists.gnu.org/archive/html/bison-patches/2019-01/msg00000.html
* data/skeletons/glr.c (yygetToken): Return YYEMPTY when an exception
is thrown.
* data/skeletons/lalr1.cc: Log when an exception is caught.
* tests/c++.at (Syntax error as exception): Be sure to recover from
error before triggering another error.
This way, it is easier to make sure its implementation is available in
glr.cc too, which is not the case currently.
* data/skeletons/c++.m4 (b4_public_types_define): Move the
implementation of syntax_error...
(b4_public_types_declare): here.
We used to create a short definition of yy::parser with all the
implementations of its member functions outside. But yy::parser is no
longer short and simple to read. Maintaining each function twice is
painful: a lot of redundancy but different indentation levels, output
which depends on whether we are in a header or not (see
d132c2d545), etc.
Let's simplify this and put the implementations into the class
definition itself.
Discussed in this monologue:
https://lists.gnu.org/archive/html/bison-patches/2018-12/msg00058.html.
* data/skeletons/c++.m4, data/skeletons/lalr1.cc,
* data/skeletons/variant.hh (b4_basic_symbol_constructor_define)
(_b4_token_constructor_declare, b4_token_constructor_declare)
Merge into...
(b4_basic_symbol_constructor_define, _b4_token_constructor_define)
(b4_token_constructor_define): these.
We used to define such auxiliary structures outside the class, mainly
as a matter of style to keep the definition of yy::parser short and
simple. However, now there's a lot more code generated inside the
class definition (e.g., all the token constructors), so the
readability argument no longer applies.
Besides, if we move stack (and slice) inside yy::parser, then it
should no longer be necessary to change the namespace to have multiple
parsers: changing the class name should suffice.
One common argument against inner classes is that they cause code bloat. It
hardly applies here, since typically different parsers will have
different semantic value types, hence different actual stack types.
* data/skeletons/lalr1.cc: Invoke b4_stack_define inside yy::parser.
Suggested by Wolfgang Thaller.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00081.html
* data/c++.m4 (basic_symbol, by_type): Instead of providing either the
move or the copy constructor, always provide the copy one.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check it.
Currently the following piece of code crashes (with parse.assert),
because we don't record that s was moved-from, and we invoke its dtor.
{
auto s = parser::make_INT (42);
auto s2 = std::move (s);
}
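
A minimal sketch of the remedy, assuming a stand-in for by_type that
records which token it holds and uses -1 (an illustrative value) for
"holds nothing". The move constructor marks its source as empty, so
that the destructor of the moved-from object has nothing to destroy.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for by_type (requires C++11).
struct by_type
{
  enum { empty_symbol = -1 };

  explicit by_type (int t) : type (t) {}

  // Move ctor: take over the type and record that THAT was moved from.
  by_type (by_type&& that) : type (that.type) { that.clear (); }

  void clear () { type = empty_symbol; }
  bool empty () const { return type == empty_symbol; }

  int type;
};
```
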
Reported by Wolfgang Thaller.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00077.html
* data/c++.m4 (by_type): Provide a move-ctor.
(basic_symbol): Be sure not to read a moved-from value.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check this case.
Instead of introducing make_symbol (whose name, btw, somewhat
infringes on the user's "name space", if she defines a token named
"symbol"), let's make the construction of symbol_type safer, using
assertions.
For instance with:
%token ':' <std::string> ID <int> INT;
generate:
symbol_type (int token, const std::string&);
symbol_type (int token, const int&);
symbol_type (int token);
It does mean that now named token constructors (make_ID, make_INT,
etc.) go through a useless assert, but I think we can ignore this: I
assume any decent compiler will inline the symbol_type ctor inside the
make_TOKEN functions, which will show that the assert is trivially
verified, hence I expect no code will be emitted for it. And anyway,
that's an assert, NDEBUG controls it.
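
A minimal sketch of such constructors for the %token line above. The
token numbers and the two plain members standing in for the variant are
assumptions made for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical token numbers for: %token ':' <std::string> ID <int> INT;
enum token_type { COLON = ':', ID = 258, INT = 259 };

struct symbol_type
{
  // Each constructor asserts that the token's declared value type matches
  // the type of the value it receives.
  symbol_type (int token, const std::string& v)
    : type (token), text (v), number (0)
  {
    assert (token == ID);
  }

  symbol_type (int token, const int& v)
    : type (token), text (), number (v)
  {
    assert (token == INT);
  }

  symbol_type (int token)
    : type (token), text (), number (0)
  {
    assert (token == COLON);
  }

  int type;
  std::string text; // stand-in for the variant's string alternative
  int number;       // stand-in for the variant's int alternative
};

// Named constructors go through the same, trivially true, assertion.
inline symbol_type make_ID (const std::string& v) { return symbol_type (ID, v); }
inline symbol_type make_INT (const int& v) { return symbol_type (INT, v); }
```

Building symbol_type (INT, std::string ("x")) would trip the assertion
at run time, unless NDEBUG disables it.
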
* data/c++.m4 (symbol_type): Turn into a subclass of
basic_symbol<by_type>.
Declare symbol constructors when variants are enabled.
* data/variant.hh (_b4_type_constructor_declare)
(_b4_type_constructor_define): Replace with...
(_b4_symbol_constructor_declare, _b4_symbol_constructor_def): these.
Generate symbol_type constructors.
* doc/bison.texi (Complete Symbols): Document.
* tests/types.at: Check.
On
%token <int> FOO BAR
we currently generate make_FOO(int) and make_BAR(int). However, in
order to factor their scanners, some users would also like to have
make_symbol(tok, int), where tok is FOO or BAR. To ensure type
safety, add assertions that do check that value type and token type
match. Bind this assertion to the parse.assert %define variable.
Suggested by Frank Heckenbach.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00034.html
Should also match expectations from Аскар Сафин.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00023.html
* data/variant.hh: Use b4_token_visible_if where applicable.
(_b4_type_constructor_declare, _b4_type_constructor_define): New.
Use them.