On some systems (x86_64-pc-solaris2.11), with Developer Studio 12.5's
CC, we get:
".../include/CC/Cstd/vector.cc", line 127: Error: Cannot assign const yy::parser::stack_symbol_type to yy::parser::stack_symbol_type without "yy::parser::stack_symbol_type::operator=(const yy::parser::stack_symbol_type&)";.
".../include/CC/Cstd/vector", line 475: Where: While instantiating "std::vector<yy::parser::stack_symbol_type>::__insert_aux(yy::parser::stack_symbol_type*, const yy::parser::stack_symbol_type&)".
".../include/CC/Cstd/vector", line 475: Where: Instantiated from non-template code.
1 Error(s) detected.
Don't expect __cplusplus to be always defined. If it's not, consider
this is C++98.
Reported by Nelson H. F. Beebe.
* data/c++.m4, data/lalr1.cc, examples/c++/variant.yy, tests/local.at,
* tests/testsuite.h:
An undefined __cplusplus means pre C++11.
We use both styles, let's stick to a single one. Autoconf uses the
prefix one, let's do the same.
* data/bison.m4, data/c++.m4, data/c-like.m4, data/lalr1.cc,
* data/variant.hh, data/yacc.c: Rename all the b4_*_ macros
as _b4_*.
Reported by Frank Heckenbach.
http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00002.html
Actually the assignment operator should never be needed: the C++98
requirements for vector::push_back is CopyInsertable, which does not require
an assignment operator. However, libstdc++ shipped with GCC up to (and
including) 6 uses the assignment operator (which affects Clang on top of
libstdc++, but also ICC). So let's keep it for legacy C++.
See https://gcc.godbolt.org/z/q0XXmC.
* data/lalr1.cc (stack_symbol_type::operator=): Remove.
* data/c++.m4 (basic_symbol::operator=): Ditto.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Adjust.
Modern C++ (i.e., C++11 and later) introduced "move only" types: types such
as std::unique_ptr<T> that can never be duplicated. They must never be
copied (by assignments and constructors), they must be "moved". The
implementation of lalr1.cc used to copy symbols (including their semantic
values). This commit ensures that values are only moved in modern C++, yet
remain compatible with C++98/C++03.
Suggested by Frank Heckenbach, who provided a full implementation on
top of C++17's std::variant.
See http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00002.html,
and https://lists.gnu.org/archive/html/bison-patches/2018-04/msg00002.html.
Symbols (terminal/non terminal) are handled by several functions that used
to take const-refs, which resulted eventually in a copy pushed on the stack.
With modern C++ (C++11 and later) the callers must use std::move, and the
callees must take their arguments as rvalue refs (foo&&). In order to avoid
duplicating these functions to support both legacy C++ and modern C++, let's
introduce macros (YY_MOVE, YY_RVREF, etc.) that rely on copy-semantics for
C++98/03, and move-semantics for modern C++.
That's easy for inner types, when the parser's functions pass arguments to
each other. Functions facing the user (make_NUMBER, make_STRING, etc.)
should support both rvalue-refs (for instance to support move-only types:
make_INT (std::make_unique<int> (1))), and lvalue-refs (so that we can pass
a variable: make_INT (my_int)). To avoid the multiplication of the
signatures (there is also the location), let's take the argument by value.
See:
https://lists.gnu.org/archive/html/bison-patches/2018-09/msg00024.html.
* data/c++.m4 (b4_cxx_portability): New.
(basic_symbol): In C++11, replace copy-ctors with move-ctors.
In C++11, replace copies with moves.
* data/lalr1.cc (stack_symbol_type, yypush_): Likewise.
Use YY_MOVE to avoid useless copies.
* data/variant.hh (variant): Support move-semantics.
(make_SYMBOL): In C++11, in order to support both read-only lvalues,
and rvalues, take the argument as a copy.
* data/stack.hh (yypush_): Use rvalue-refs in C++11.
* tests/c++.at: Use move semantics.
* tests/headers.at: Adjust to the new macros (YY_MOVE, etc.).
* configure.ac (CXX98_CXXFLAGS, CXX11_CXXFLAGS, CXX14_CXXFLAGS)
(CXX17_CXXFLAGS, ENABLE_CXX11): New.
* tests/atlocal.in: Receive them.
* examples/variant.yy: Don't define things in std.
* examples/variant-11.test, examples/variant-11.yy: New.
Check the support of move-only types.
* examples/README, examples/local.mk: Adjust.
This generates less code, which is nicer to read, but also takes less
chances with compilers such as G++ 4.8 that are too strict and check
"dead code" (templated code that is not instantiated).
* data/c++.m4 (b4_symbol_type_declare, b4_symbol_type_define): When
variants are used, don't generate code meant for non variants.
Currently, in glr.cc, we emit the definitions of basic_symbol and
symbol_type, although there are not used.
* data/c++.m4 (b4_public_types_declare): Extract these definitions from
here, and move them...
(b4_symbol_type_declare): here.
(b4_public_types_declare): Also remove the definition of the symbol
constructors.
* data/lalr1.cc (b4_shared_declarations): Adjust: call
b4_symbol_type_declare and b4_symbol_constructor_declare.
In 0931d14728 I removed too many
initializations from some ctors: some were not about base ctors, but
about member variables. In fact, more of them were missing to please
GCC 8.
While at it, generate more natural code for C++ without variant:
instead of
template <typename Base>
parser::basic_symbol<Base>::basic_symbol (const basic_symbol& other)
: Base (other)
, value ()
{
value = other.value
}
generate
template <typename Base>
parser::basic_symbol<Base>::basic_symbol (const basic_symbol& other)
: Base (other)
, value (other.value)
{}
* data/c++.m4 (basic_symbol::basic_symbol): Always initialize 'value',
it might be a POD without a ctor.
* data/lalr1.cc (stack_symbol_type::stack_symbol_type): Likewise.
* data/variant.hh (variant::variant): Default initialize the buffer too.
GCC 8 issues warnings whose root cause was a bit hard to find.
calc.cc: In member function 'virtual int yy::parser::parse()':
calc.cc:810:18: warning: '*((void*)&<anonymous> +8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
, location (l)
^
calc.cc: In member function 'void yy::parser::yypush_(const char*, yy::parser::stack_symbol_type&)':
calc.cc:810:18: warning: '*((void*)&<anonymous> +8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
, location (l)
^
calc.cc: In member function 'void yy::parser::yypush_(const char*, yy::parser::state_type, yy::parser::symbol_type&)':
calc.cc:810:18: warning: '*((void*)&<anonymous> +8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
, location (l)
^
The problem is with locations that don't have a constructor, such as
Span (in calc.cc) which is POD. It is POD on purpose: so that we can
use that structure to test glr.cc which cannot use non POD in its
(C) stacks.
* data/c++.m4 (basic_symbol): Also ensure that 'location' is
initialized.
Sometimes `inline` would be used in *.cc files on symbols that are not
exported (useless but harmless), and sometimes on exported symbols
such as the constructor of syntax_error (harmful: linking fails).
Reported several times, including:
- by Dennis T
http://lists.gnu.org/archive/html/bug-bison/2016-03/msg00002.html
- by Frank Heckenbach
https://savannah.gnu.org/patch/?9616
* data/c++.m4 (b4_inline): New: expands to `inline` or nothing.
Use it where appropriate.
* data/lalr1.cc: Use it where appropriate.
* tests/c++.at (Syntax error as exception): Put the scanner in another
compilation unit to exercise the constructor of syntax_error.
When using variants, destructors generate invalid code.
<http://lists.gnu.org/archive/html/bug-bison/2014-09/msg00005.html>
Reported by Michael Catanzaro.
* data/c++.m4 (~basic_symbol): b4_symbol_foreach works on yysym:
define it.
* tests/c++.at (Variants): Check it.
The symbol destructor is currently the only means to clear a symbol.
Unfortunately during error recovery we might have to clear the
lookahead, which is a local variable (yyla) that has not yet reached
its end of scope.
Rather that duplicating the code to destroy a symbol, or rather than
destroying and recreating yyla, let's provide a means to clear a
symbol.
Reported by Antonio Silva Correia, with an analysis from Michel d'Hooge.
<http://savannah.gnu.org/support/?108481>
* data/c++.m4, data/lalr1.cc (basis_symbol::clear, by_state::clear)
(by_type::clear): New.
(basic_symbol::~basic_symbol): Use clear.
Bison supports a union tag, for obscure reasons. But it does a poor
job at it, especially since Bison 3.0.
Reported by Stephen Cameron and Tobias Frost.
It did not ensure that the name was not given several times. An easy
way to do this is to make the %union tag be handled as a %define
variable, as they cannot be defined several times.
Since Bison 3.0, the synclines were wrongly placed, resulting in
invalid code. Addressing this issue, because of the way the union tag
was stored (as a code muscle), would have been tedious. Unless we
rather define the %union tag as a %percent variable, whose synclines
are easier to manipulate.
So replace the b4_union_name muscle by the api.value.union.name
%define variable, document, and check.
* data/bison.m4: Make sure that api.value.union.name has a keyword value.
* data/c++.m4: Make sure that api.value.union.name is not defined.
* data/c.m4 (b4_union_name): No longer use it, use api.value.union.name.
* doc/bison.texi (%define Summary): Document it.
* src/parse-gram.y (union_name): No longer define b4_uion_name, but
api.value.union.name.
* tests/input.at (Redefined %union name): New.
* tests/synclines.at (%union name syncline): New.
* tests/types.at: Check named %unions.
* data/bison.m4 (b4_percent_define_check_kind): New.
Use it to check api.token.prefix.
* data/c++.m4: Check the kind of api.namespace.
* doc/bison.texi: Update a reference to former 'namespace' variable.
* tests/input.at ("%define" code variables): Check api.namespace.
Suggested by Joel E. Denny.
http://lists.gnu.org/archive/html/bison-patches/2013-03/msg00016.html
* data/bison.m4 (b4_percent_define_get_kind): New.
(b4_variant_flag): Check that api.value.type is defined as the 'variant'
keyword value.
* data/c.m4 (_b4_value_type_setup_keyword): New.
(b4_value_type_setup): Use it to simplify reading.
Use b4_define_silent.
Decode api.value.type, including its type.
(b4_value_type_define): Likewise.
* data/c++.m4 (b4_value_type_declare): Adjust the decoding of api.value.type,
taking its kind into account.
* doc/bison.texi: Adjust all the examples to the new syntax.
* NEWS: Ditto.
* tests/types.at: Adjust
* data/c.m4 (b4_symbol_type_register, b4_type_define_tag)
(b4_symbol_value_union, b4_value_type_setup_union)
(b4_value_type_setup_variant, b4_value_type_setup):
New.
(b4_value_type_define): Use it to set up properly the type.
Handle the various possible values of api.value.type.
* data/c++.m4 (b4_value_type_declare): Likewise.
* data/lalr1.cc (b4_value_type_setup_variant): Redefine.
* tests/types.at: New.
Exercise all the C/C++ skeletons with different types of
api.value.type values.
* tests/local.mk, tests/testsuite.at: Use it.
* doc/bison.texi (%define Summary): Document api.value.type.
* NEWS: Advertise it, together with api.token.constructor.
This is to match the names used in C and api.value.type, even if the
parser actually defines semantic_type.
* data/c++.m4 (b4_semantic_type_declare): Rename as...
(b4_value_type_declare): this.
* data/variant.hh: Likewise.
"stype" is quite unclear, and it also collides with the former %define
variable that had the same name (replaced by api.value.type).
* src/parse-gram.y (stype): Rename as...
(union_members): this.
* data/bison.m4: Adjust.
(b4_user_stype): Rename as...
(b4_user_union_members): this.
* data/c++.m4, data/c.m4: Adjust.
* src/parse-gram.c: regen.
Recently, there was a slightly vicious bug hidden in the make_ functions:
parser::symbol_type
parser::make_TEXT (const ::std::string& v)
{
return symbol_type (token::TOK_TEXT, v);
}
The constructor for symbol_type doesn't take an ::std::string& as
argument, but a constant variant. However, because there is a variant
constructor which takes an ::std::string&, this caused the implicit
construction of a built variant. Considering that the variant argument
for the symbol_type constructor was cv-qualified, this temporary variant
was never destroyed.
As a temporary solution, the symbol was built in two stages:
symbol_type res (token::TOK_TEXT);
res.value.build< ::std::string&> (v);
return res;
However, the solution introduced in this patch contributes to letting
the symbols handle themselves, by supplying them with constructors that
take a non-variant value and build the symbol's own variant with that
value.
* data/variant.hh (b4_symbol_constructor_define_): Use the new
constructors rather than building in a temporary symbol.
(b4_basic_symbol_constructor_declare,
b4_basic_symbol_constructor_define): New macros generating the
constructors.
* data/c++.m4 (basic_symbol): Invoke the macros here.
* data/c++.m4, data/lalr1.cc (by_state, by_type): Do not use -1 to
denote the absence of value, as GCC then fears that this -1 might
be used to dereference arrays (such as yytname).
Use 0, which corresponds to $accept, which is valueless (the needed
property: the symbol destructor must not try to reclaim the memory
associated with the symbol).
* data/c++.m4 (b4_public_types_declare): Declare token_number_type soon.
Introduce symbol_number_type (wider than token_number_type).
Clarify the requirement that kind_type from by_state and by_type
denote the _input_ type (required by the constructor), not the stored type.
Use symbol_number_type and token_number_type where appropriate, instead
of int.
* data/lalr1.cc: Adjust to these changes.
Propagate "symbol_number_type".
Invoke "type_get ()" instead of read "type" directly.
Many 'inline' keywords were in the declarations. They rather belong in
definitions, so move them.
* data/c++.m4 (basic_symbol, by_type): Many inlines here.
* data/lalr1.cc (yytranslate_, yy_destroy_, by_state, yypush_, yypop_): Inline
these as well.
(move): Move the definition outside the struct, where it belongs.
Now that symbols behaves properly, we can eliminate special routines
that are no longer needed.
* data/c++.m4, data/glr.cc, data/lalr1.cc, data/variant.hh:
Remove useless assignment operators and copy constructors.
As a consequence, remove useless includes for "abort".
The current approach was too adhoc: the symbols were not sufficiently
self-contained, in particular wrt memory management. The "new"
guideline is the one that should have been followed from the start:
let the symbols handle themslves, instead of leaving their users to
it. It was justified by the will to avoid gratuitious moves and
copies, but the current approach does not seem to be slower, yet it
will probably be simpler to adjust to support move semantics from
C++11.
The documentation says that the %parse-param are available from the
%destructor. In retrospect, that was a silly design decision, which
we can break for variants, as its a new feature. It should be phased
out for non-variants too.
* data/variant.hh: A variant never knows if it stores something or
not, it is up to its users to store this information.
Yet, in parse.assert mode, make sure the empty/filled variants
are properly used.
(b4_symbol_constructor_define_): Don't call directly the symbol
constructor, to save a useless temporary.
* data/stack.hh (push): Steal the pushed value instead of duplicating
it.
This will simplify the callers of push, who handled this "move"
approach themselves.
* data/c++.m4 (basic_symbol): Let -1, as kind, denote the fact that
a symbol is empty.
This is needed for instance when shifting the lookahead: yyla
is given as argument to "push", and its value is then moved on
the stack. But then yyla must be declared "empty" so that its
destructor won't be called.
(basic_symbol::move): New.
Move the responsibility of calling the destructor from yy_destroy
to ~basic_symbol in the case of variants.
* data/lalr1.cc (stack_symbol_type): Now a derived class from its
previous value, so that we can add a constructor from a symbol_type.
(by_state): State -1 means empty.
(yypush_): Factor, by calling one overload from the other one, and
using the new semantics of stack::push.
No longer reclaim by hand the memory from rhs symbols, since now
that we store objects with proper destructors, they will be reclaimed
automatically.
Conversely, be sure to delete yylhs.
* tests/c++.at (C++ Variant-based Symbols): New "unit" test for
symbols.
* data/c++.m4 (basic_symbol): Keep 'inline' in the prototypes, but don't
duplicate it in the implementation.
* data/variant.hh (variant): 'inline' is not needed when the implementation is
provided in the class definition.
* data/variant.hh (variant, operator=): Make private.
* data/c++.m4 (operator=): New, to avoid needing a definition of that operator
for each class member (such as a possible variant).
* data/glr.cc, data/lalr.cc: Add the necessary include for the abort.