When we introduced variants in Bison, C++ did not have the 'emplace'
functions, and we chose 'build'. Let's align with modern C++ and
promote 'emplace' rather than 'build'.
* data/lalr1.cc, data/variant.hh (emplace): New.
(build): Deprecate in favor of emplace.
* doc/bison.texi: Adjust.
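A minimal sketch of the renaming, assuming a simplified buffer class (`tiny_variant` and its members are hypothetical, not the code generated from data/variant.hh). It shows `emplace` constructing the value in place, with `build` kept as a thin wrapper for existing callers.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a parser's value buffer.
template <std::size_t Size>
class tiny_variant
{
public:
  tiny_variant () : built_ (false) {}

  // Construct a T in place, forwarding the constructor arguments.
  template <typename T, typename... Args>
  T& emplace (Args&&... args)
  {
    static_assert (sizeof (T) <= Size, "buffer too small");
    assert (!built_);
    built_ = true;
    return *new (buffer_) T (std::forward<Args> (args)...);
  }

  // Older spelling, kept so that existing callers still compile.
  template <typename T, typename... Args>
  T& build (Args&&... args)
  {
    return emplace<T> (std::forward<Args> (args)...);
  }

  // Typed access to the stored value.
  template <typename T>
  T& as ()
  {
    assert (built_);
    return *static_cast<T*> (static_cast<void*> (buffer_));
  }

  // The buffer does not know its content: the caller names the type.
  template <typename T>
  void destroy ()
  {
    as<T> ().~T ();
    built_ = false;
  }

private:
  alignas (std::max_align_t) unsigned char buffer_[Size];
  bool built_;
};
```

In this sketch both spellings accept constructor arguments; whether the deprecated `build` in the real skeleton does the same is not stated here.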
We use both underscore styles for macro names, suffix (b4_*_) and
prefix (_b4_*); let's stick to a single one. Autoconf uses the
prefix one, so let's do the same.
* data/bison.m4, data/c++.m4, data/c-like.m4, data/lalr1.cc,
* data/variant.hh, data/yacc.c: Rename all the b4_*_ macros
as _b4_*.
Modern C++ (i.e., C++11 and later) introduced "move only" types: types such
as std::unique_ptr<T> that can never be duplicated. They must never be
copied (by assignments and constructors); they must be "moved". The
implementation of lalr1.cc used to copy symbols (including their semantic
values). This commit ensures that values are only moved in modern C++, yet
remain compatible with C++98/C++03.
Suggested by Frank Heckenbach, who provided a full implementation on
top of C++17's std::variant.
See http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00002.html,
and https://lists.gnu.org/archive/html/bison-patches/2018-04/msg00002.html.
Symbols (terminal/non terminal) are handled by several functions that used
to take const-refs, which resulted eventually in a copy pushed on the stack.
With modern C++ (C++11 and later) the callers must use std::move, and the
callees must take their arguments as rvalue refs (foo&&). In order to avoid
duplicating these functions to support both legacy C++ and modern C++, let's
introduce macros (YY_MOVE, YY_RVREF, etc.) that rely on copy-semantics for
C++98/03, and move-semantics for modern C++.
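A minimal sketch of such macros, under the assumption that testing `__cplusplus` is enough to tell the dialects apart. The `EX_*` names, `ex_stack` and `push_twice` are hypothetical; the macros actually emitted by the skeletons may be defined differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical macros: move in C++11 and later, copy before.
#if 201103L <= __cplusplus
# define EX_MOVE(Value) std::move (Value)
# define EX_RVREF(Type) Type&&
#else
# define EX_MOVE(Value) (Value)
# define EX_RVREF(Type) const Type&
#endif

struct ex_stack
{
  // One signature for both dialects: rvalue-ref in C++11, const-ref before.
  void push (EX_RVREF (std::string) s)
  {
    items.push_back (EX_MOVE (s));
  }
  std::vector<std::string> items;
};

// Callers spell the hand-over the same way in both dialects.
inline std::size_t push_twice (ex_stack& st)
{
  std::string a ("alpha");
  std::string b ("beta");
  st.push (EX_MOVE (a));
  st.push (EX_MOVE (b));
  assert (2 <= st.items.size ());
  return st.items.size ();
}
```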
That's easy for inner types, when the parser's functions pass arguments to
each other. Functions facing the user (make_NUMBER, make_STRING, etc.)
should support both rvalue-refs (for instance to support move-only types:
make_INT (std::make_unique<int> (1))), and lvalue-refs (so that we can pass
a variable: make_INT (my_int)). To avoid the multiplication of the
signatures (there is also the location), let's take the argument by value.
See:
https://lists.gnu.org/archive/html/bison-patches/2018-09/msg00024.html.
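A minimal sketch of the by-value choice, assuming C++11 and two hypothetical symbol types (`text_symbol`, `int_symbol`); the generated make_ functions also deal with token kinds and locations, which are left out here. A single signature accepts an lvalue (copied into the parameter) and an rvalue (moved into it), and it also works for a move-only type.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical symbol holding a copyable value.
struct text_symbol
{
  explicit text_symbol (std::string v) : value (std::move (v)) {}
  std::string value;
};

// Hypothetical symbol holding a move-only value.
struct int_symbol
{
  explicit int_symbol (std::unique_ptr<int> v) : value (std::move (v))
  {
    assert (value);  // this sketch expects a non-null pointer
  }
  std::unique_ptr<int> value;
};

// By value: the caller decides whether the argument is copied or moved.
inline text_symbol make_TEXT (std::string v)
{
  return text_symbol (std::move (v));
}

// By value also accepts move-only arguments, as long as they are rvalues.
inline int_symbol make_INT (std::unique_ptr<int> v)
{
  return int_symbol (std::move (v));
}
```

The cost of this choice is one extra move per call compared with a dedicated rvalue-ref overload, which is usually cheap.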
* data/c++.m4 (b4_cxx_portability): New.
(basic_symbol): In C++11, replace copy-ctors with move-ctors.
In C++11, replace copies with moves.
* data/lalr1.cc (stack_symbol_type, yypush_): Likewise.
Use YY_MOVE to avoid useless copies.
* data/variant.hh (variant): Support move-semantics.
(make_SYMBOL): In C++11, in order to support both read-only lvalues,
and rvalues, take the argument as a copy.
* data/stack.hh (yypush_): Use rvalue-refs in C++11.
* tests/c++.at: Use move semantics.
* tests/headers.at: Adjust to the new macros (YY_MOVE, etc.).
* configure.ac (CXX98_CXXFLAGS, CXX11_CXXFLAGS, CXX14_CXXFLAGS)
(CXX17_CXXFLAGS, ENABLE_CXX11): New.
* tests/atlocal.in: Receive them.
* examples/variant.yy: Don't define things in std.
* examples/variant-11.test, examples/variant-11.yy: New.
Check the support of move-only types.
* examples/README, examples/local.mk: Adjust.
Currently, in bison's C++ parser template (`lalr1.cc`), the `variant<>`
struct's `build()` method uses placement-new in the form `new (...) T`
to initialize a variant type. However, for POD variant types, this
leaves the memory uninitialized. If we subsequently try
to `::move` into a variant object in such a state, the call can trigger
clang's undefined behavior sanitizer because it accesses
uninitialized memory.
https://lists.gnu.org/archive/html/bison-patches/2018-08/msg00098.html
* data/variant.hh (build): Always initialize the stored value.
Signed-off-by: Akim Demaille <akim@lrde.epita.fr>
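A minimal sketch of the difference between the two placement-new spellings, assuming C++11. The helper names are hypothetical, and that the fix uses exactly the `T ()` form is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Default-initialization: for a POD, the object is left indeterminate.
template <typename T>
T* build_default (void* where)
{
  return new (where) T;
}

// Value-initialization: for a POD, the object is zeroed.
template <typename T>
T* build_value (void* where)
{
  return new (where) T ();
}

inline int zeroed_after_build ()
{
  alignas (int) unsigned char buffer[sizeof (int)];
  std::memset (buffer, 0xAB, sizeof buffer);  // simulate stale contents
  int* p = build_value<int> (buffer);
  assert (static_cast<void*> (p) == static_cast<void*> (buffer));
  return *p;  // safe to read: the int was value-initialized
}
```

Reading through the pointer returned by `build_default<int>` before assigning to it would be the problematic case described above.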
In 0931d14728 I removed too many
initializations from some ctors: some were not about base ctors, but
about member variables. In fact, more of them were missing to please
GCC 8.
While at it, generate more natural code for C++ without variant:
instead of
  template <typename Base>
  parser::basic_symbol<Base>::basic_symbol (const basic_symbol& other)
    : Base (other)
    , value ()
  {
    value = other.value;
  }
generate
  template <typename Base>
  parser::basic_symbol<Base>::basic_symbol (const basic_symbol& other)
    : Base (other)
    , value (other.value)
  {}
* data/c++.m4 (basic_symbol::basic_symbol): Always initialize 'value',
it might be a POD without a ctor.
* data/lalr1.cc (stack_symbol_type::stack_symbol_type): Likewise.
* data/variant.hh (variant::variant): Default initialize the buffer too.
Fix a typo so that instead of
  basic_symbol::basic_symbol (typename Base::kind_type t, const int v)
we now generate
  basic_symbol::basic_symbol (typename Base::kind_type t, const int& v)
* data/variant.hh (b4_basic_symbol_constructor_declare)
(b4_basic_symbol_constructor_define): Add missing reference.
Instead of storing and comparing pointers to names of types, store
pointers to the typeids, and compare the typeids.
Reported by Thomas Jahns.
<http://lists.gnu.org/archive/html/bug-bison/2014-03/msg00001.html>
* data/variant.hh (yytname_): Replace with...
(yytypeid_): this.
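A minimal sketch of the check, assuming C++11. The `type_tag` class is hypothetical; only the idea of keeping a pointer to the typeid and comparing the `std::type_info` objects themselves is taken from the message above.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical debug bookkeeping: remembers which type is stored.
class type_tag
{
public:
  type_tag () : id_ (nullptr) {}

  template <typename T>
  void set ()
  {
    assert (!id_);  // must be empty before recording a new type
    id_ = &typeid (T);
  }

  void clear () { id_ = nullptr; }

  // Compare the type_info objects, not their addresses or name pointers:
  // two evaluations of typeid on the same type are not guaranteed to
  // refer to the same object.
  template <typename T>
  bool holds () const
  {
    return id_ && *id_ == typeid (T);
  }

private:
  const std::type_info* id_;
};
```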
This is to match the names used in C and api.value.type, even if the
parser actually defines semantic_type.
* data/c++.m4 (b4_semantic_type_declare): Rename as...
(b4_value_type_declare): this.
* data/variant.hh: Likewise.
The changes by Théophile Ranquet about type punning issues need
to be extended to in-place new to please G++ 4.4.7.
* data/variant.hh (variant::as_): New, factors the casts that avoid
compiler warnings.
(as, build): Use them.
This is based on what is recommended by both Scott Meyers, in 'Effective
C++', and Andrei Alexandrescu and Herb Sutter in 'C++ Coding Standards'.
Use a static_cast on void* rather than directly use a reinterpret_cast,
which can have nefarious effects on objects. However, even though following
this guideline is good practice in general, I am not quite sure how relevant
it is when applied to conversions from POD to objects. Actually, it might
very well be the opposite: isn't this exactly what reinterpret_cast is for?
What we really want *is* to transmit the memory map as a series of bytes,
which, if I am correct, falls into the kind of "low level" hack for which
this cast is meant.
In any case, this silences the warning, which will be greatly appreciated by
anyone using variants with a compiler supporting -fstrict-aliasing.
* data/variant.hh (as): Here.
* tests/c++.at (Exception safety, C++ Variant-based Symbols, Variants):
Don't use NO_STRICT_ALIAS_CXXFLAGS (revert commit ddb9db15), as type punning
is no longer an issue.
* tests/atlocal.in, configure.ac (NO_STRICT_ALIAS_CXXFLAGS): Remove
definition.
* examples/local.mk (NO_STRICT_ALIAS_CXXFLAGS): Remove from AM_CXXFLAGS.
* doc/bison.texi: Don't mention type punning issues.
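A minimal sketch of the cast discussed above, assuming C++11 and a hypothetical `raw_storage` type. The accessor goes through `void*` with a static_cast instead of using reinterpret_cast on the buffer, and in-place new uses the same accessor.

```cpp
#include <cassert>
#include <new>

// Hypothetical raw storage with a single, factored accessor.
struct raw_storage
{
  template <typename T>
  T* as_ ()
  {
    void* p = buffer;            // implicit conversion to void*
    return static_cast<T*> (p);  // then static_cast to the target type
  }

  alignas (double) unsigned char buffer[sizeof (double)];
};

inline double round_trip (double d)
{
  raw_storage s;
  // In-place construction goes through the same accessor.
  double* p = new (s.as_<double> ()) double (d);
  assert (p == s.as_<double> ());
  return *s.as_<double> ();
}
```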
When using %define parse.assert, the variants come with additional variables
that are useful for development purposes. One is a Boolean indicating if the
variant is built (to make sure we don't read a non-built variant), and the
other is a string describing the stored type. There is no need to have both of
these; the string is enough.
* data/variant.hh (built): Remove.
Recently, there was a slightly vicious bug hidden in the make_ functions:
  parser::symbol_type
  parser::make_TEXT (const ::std::string& v)
  {
    return symbol_type (token::TOK_TEXT, v);
  }
The constructor for symbol_type doesn't take an ::std::string& as
argument, but a constant variant. However, because there is a variant
constructor which takes an ::std::string&, this caused the implicit
construction of a built variant. Considering that the variant argument
for the symbol_type constructor was cv-qualified, this temporary variant
was never destroyed.
As a temporary solution, the symbol was built in two stages:
  symbol_type res (token::TOK_TEXT);
  res.value.build< ::std::string&> (v);
  return res;
However, the solution introduced in this patch contributes to letting
the symbols handle themselves, by supplying them with constructors that
take a non-variant value and build the symbol's own variant with that
value.
* data/variant.hh (b4_symbol_constructor_define_): Use the new
constructors rather than building in a temporary symbol.
(b4_basic_symbol_constructor_declare,
b4_basic_symbol_constructor_define): New macros generating the
constructors.
* data/c++.m4 (basic_symbol): Invoke the macros here.
Now that symbols behave properly, we can eliminate special routines
that are no longer needed.
* data/c++.m4, data/glr.cc, data/lalr1.cc, data/variant.hh:
Remove useless assignment operators and copy constructors.
As a consequence, remove useless includes for "abort".
The previous approach was too ad hoc: the symbols were not sufficiently
self-contained, in particular wrt memory management. The "new"
guideline is the one that should have been followed from the start:
let the symbols handle themselves, instead of leaving their users to
it. The previous approach was justified by the will to avoid gratuitous
moves and copies, but the new approach does not seem to be slower, and
it will probably be simpler to adjust to support move semantics from
C++11.
The documentation says that the %parse-param are available from the
%destructor. In retrospect, that was a silly design decision, which
we can break for variants, as it's a new feature. It should be phased
out for non-variants too.
* data/variant.hh: A variant never knows if it stores something or
not, it is up to its users to store this information.
Yet, in parse.assert mode, make sure the empty/filled variants
are properly used.
(b4_symbol_constructor_define_): Don't call directly the symbol
constructor, to save a useless temporary.
* data/stack.hh (push): Steal the pushed value instead of duplicating
it.
This will simplify the callers of push, who handled this "move"
approach themselves.
* data/c++.m4 (basic_symbol): Let -1, as kind, denote the fact that
a symbol is empty.
This is needed for instance when shifting the lookahead: yyla
is given as argument to "push", and its value is then moved on
the stack. But then yyla must be declared "empty" so that its
destructor won't be called.
(basic_symbol::move): New.
Move the responsibility of calling the destructor from yy_destroy
to ~basic_symbol in the case of variants.
* data/lalr1.cc (stack_symbol_type): Now a derived class from its
previous value, so that we can add a constructor from a symbol_type.
(by_state): State -1 means empty.
(yypush_): Factor, by calling one overload from the other one, and
using the new semantics of stack::push.
No longer reclaim by hand the memory from rhs symbols, since now
that we store objects with proper destructors, they will be reclaimed
automatically.
Conversely, be sure to delete yylhs.
* tests/c++.at (C++ Variant-based Symbols): New "unit" test for
symbols.
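A minimal sketch of the "steal on push" idea, with hypothetical types (`sym`, `sym_stack`) and a plain std::string standing for the semantic value. After the push, the source symbol has kind -1, so nothing is left in it to destroy.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical symbol: kind -1 denotes an empty symbol.
struct sym
{
  sym () : kind (-1) {}
  sym (int k, const std::string& v) : kind (k), value (v) {}

  bool empty () const { return kind == -1; }

  // Steal the contents of `that`, leaving it empty.
  void move (sym& that)
  {
    kind = that.kind;
    value.swap (that.value);
    that.value.clear ();
    that.kind = -1;
  }

  int kind;
  std::string value;
};

struct sym_stack
{
  // Push steals the value instead of duplicating it.
  void push (sym& s)
  {
    items.push_back (sym ());
    items.back ().move (s);
    assert (s.empty ());
  }

  std::vector<sym> items;
};
```

With this convention, shifting the lookahead amounts to pushing it: the lookahead is empty afterwards and the stack owns the value.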
* data/c++.m4 (basic_symbol): Keep 'inline' in the prototypes, but don't
duplicate it in the implementation.
* data/variant.hh (variant): 'inline' is not needed when the implementation is
provided in the class definition.
* data/variant.hh (variant, operator=): Make private.
* data/c++.m4 (operator=): New, to avoid needing a definition of that operator
for each class member (such as a possible variant).
* data/glr.cc, data/lalr1.cc: Add the necessary include for the abort.
The "variant" structure provides a means to store, in a typeless way,
C++ objects. Manipulating it without providing the type of the stored
content is doomed to failure. So provide a means to copy in a type
safe way, and prohibit typeless assignments.
* data/c++.m4 (symbol_type::move): New.
* data/lalr1.cc: Use it.
* data/variant.hh (b4_variant_define): Provide variant::copy.
Let variant::operator= abort.
We cannot undefine it, yet, as it is still used by the implicit
assignment in symbols, which must also be disabled.
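A minimal sketch of a type-safe copy, assuming C++11 and a hypothetical `typed_box` with a fixed 64-byte buffer. Here the typeless copy operations are deleted rather than made to abort, which is a swapped-in technique that the C++98 code of the time could not use.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical untyped buffer: every operation must name the stored type.
class typed_box
{
public:
  typed_box () {}

  // Typeless copies cannot know which constructor to run: forbid them.
  typed_box (const typed_box&) = delete;
  typed_box& operator= (const typed_box&) = delete;

  template <typename T>
  T& build (const T& t)
  {
    static_assert (sizeof (T) <= sizeof buffer_, "buffer too small");
    return *new (buffer_) T (t);
  }

  template <typename T>
  T& as ()
  {
    return *static_cast<T*> (static_cast<void*> (buffer_));
  }

  template <typename T>
  const T& as () const
  {
    return *static_cast<const T*> (static_cast<const void*> (buffer_));
  }

  // Type-safe copy: the caller states what is stored in `that`.
  template <typename T>
  void copy (const typed_box& that)
  {
    assert (this != &that);
    build<T> (that.as<T> ());
  }

  template <typename T>
  void destroy ()
  {
    as<T> ().~T ();
  }

private:
  alignas (std::max_align_t) unsigned char buffer_[64];
};
```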
Equip variants with more checking code. Provide a means to request
includes.
* data/variant.hh (b4_variant_includes): New.
* data/lalr1.cc: Use it.
* data/variant.hh (variant::built): Define at the end, as a private member.
(variant::tname): New.
Somewhat makes "built" useless, but let's keep both for a start, in
case using "typeinfo" is considered unacceptable in some environments.
Fix some formatting issues.
* data/c++.m4, data/lalr1.cc (parser::symbol_type): Change the
constructor to take a token_type instead of the (internal) symbol
number.
Call yytranslate_.
* data/variant.hh (b4_symbol_constructor_define_): Therefore,
don't call yytranslate_ here.