We use both styles, let's stick to a single one. Autoconf uses the
prefix one, let's do the same.
* data/bison.m4, data/c++.m4, data/c-like.m4, data/lalr1.cc,
* data/variant.hh, data/yacc.c: Rename all the b4_*_ macros
as _b4_*.
The CI is littered with
# -*- compilation -*-
423. regression.at:907: testing Dancer %glr-parser ...
./regression.at:907: bison -fno-caret -o dancer.c dancer.y
./regression.at:907: $BISON_C_WORKS
stderr:
stdout:
./regression.at:907: $CC $CFLAGS $CPPFLAGS $LDFLAGS -o dancer dancer.c $LIBS
stderr:
icc: command line warning #10006: ignoring unknown option '-Wcast-align'
icc: command line warning #10006: ignoring unknown option '-fparse-all-comments'
icc: command line warning #10006: ignoring unknown option '-Wdocumentation'
icc: command line warning #10006: ignoring unknown option '-Wnull-dereference'
icc: command line warning #10006: ignoring unknown option '-Wbad-function-cast'
icc: command line warning #10006: ignoring unknown option '-fno-color-diagnostics'
icc: command line warning #10006: ignoring unknown option '-Wno-keyword-macro'
dancer.c(755): error #1628: function declared with "noreturn" does return
}
^
dancer.c(761): error #1628: function declared with "noreturn" does return
}
^
compilation aborted for dancer.c (code 2)
ICC sees that `longjmp(buf, 1);` does not return, it sees that
`abort();` does not either, but fails to see it for
`longjmp(buf, 1); abort();`
* data/glr.c (YYLONGJMP): Be even clearer on the fact this does not
return.
In C++ pre C++11 it is standard practice to use 0 for the null pointer.
But GCC pre 8 -std=c++98 with -Wzero-as-null-pointer-constant warns about
this.
So disable -Wzero-as-null-pointer-constant when compiling C++ pre 11.
Let's do this in AT_DATA_SOURCE_PROLOGUE (which is pasted on top of
all the test grammar files). Unfortunately, that shifts all the
locations in the expected error messages, which would be too noisy.
Instead, let's introduce testsuite.h, which can vary in length, and
include it in AT_DATA_SOURCE_PROLOGUE.
* tests/testsuite.h: New.
Disable -Wzero-as-null-pointer-constant's warning with GCC pre 8,
C++ pre 11.
* tests/local.at (AT_DATA_SOURCE_PROLOGUE): Use it.
* tests/atlocal.in (CPPFLAGS): Find it.
* tests/local.mk: Ship it.
* data/c.m4 (YY_NULLPTR): Prefer ((void*)0) to 0 in C.
Reported by Frank Heckenbach.
http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00002.html
Actually the assignment operator should never be needed: the C++98
requirements for vector::push_back is CopyInsertable, which does not require
an assignment operator. However, libstdc++ shipped with GCC up to (and
including) 6 uses the assignment operator (which affects Clang on top of
libstdc++, but also ICC). So let's keep it for legacy C++.
See https://gcc.godbolt.org/z/q0XXmC.
* data/lalr1.cc (stack_symbol_type::operator=): Remove.
* data/c++.m4 (basic_symbol::operator=): Ditto.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Adjust.
Modern C++ (i.e., C++11 and later) introduced "move only" types: types such
as std::unique_ptr<T> that can never be duplicated. They must never be
copied (by assignments and constructors), they must be "moved". The
implementation of lalr1.cc used to copy symbols (including their semantic
values). This commit ensures that values are only moved in modern C++, yet
remain compatible with C++98/C++03.
Suggested by Frank Heckenbach, who provided a full implementation on
top of C++17's std::variant.
See http://lists.gnu.org/archive/html/bug-bison/2018-03/msg00002.html,
and https://lists.gnu.org/archive/html/bison-patches/2018-04/msg00002.html.
Symbols (terminal/non terminal) are handled by several functions that used
to take const-refs, which resulted eventually in a copy pushed on the stack.
With modern C++ (C++11 and later) the callers must use std::move, and the
callees must take their arguments as rvalue refs (foo&&). In order to avoid
duplicating these functions to support both legacy C++ and modern C++, let's
introduce macros (YY_MOVE, YY_RVREF, etc.) that rely on copy-semantics for
C++98/03, and move-semantics for modern C++.
That's easy for inner types, when the parser's functions pass arguments to
each other. Functions facing the user (make_NUMBER, make_STRING, etc.)
should support both rvalue-refs (for instance to support move-only types:
make_INT (std::make_unique<int> (1))), and lvalue-refs (so that we can pass
a variable: make_INT (my_int)). To avoid the multiplication of the
signatures (there is also the location), let's take the argument by value.
See:
https://lists.gnu.org/archive/html/bison-patches/2018-09/msg00024.html.
* data/c++.m4 (b4_cxx_portability): New.
(basic_symbol): In C++11, replace copy-ctors with move-ctors.
In C++11, replace copies with moves.
* data/lalr1.cc (stack_symbol_type, yypush_): Likewise.
Use YY_MOVE to avoid useless copies.
* data/variant.hh (variant): Support move-semantics.
(make_SYMBOL): In C++11, in order to support both read-only lvalues,
and rvalues, take the argument as a copy.
* data/stack.hh (yypush_): Use rvalue-refs in C++11.
* tests/c++.at: Use move semantics.
* tests/headers.at: Adjust to the new macros (YY_MOVE, etc.).
* configure.ac (CXX98_CXXFLAGS, CXX11_CXXFLAGS, CXX14_CXXFLAGS)
(CXX17_CXXFLAGS, ENABLE_CXX11): New.
* tests/atlocal.in: Receive them.
* examples/variant.yy: Don't define things in std.
* examples/variant-11.test, examples/variant-11.yy: New.
Check the support of move-only types.
* examples/README, examples/local.mk: Adjust.
This generates less code, which is nicer to read, but also takes less
chances with compilers such as G++ 4.8 that are too strict and check
"dead code" (templated code that is not instantiated).
* data/c++.m4 (b4_symbol_type_declare, b4_symbol_type_define): When
variants are used, don't generate code meant for non variants.
Currently, in glr.cc, we emit the definitions of basic_symbol and
symbol_type, although there are not used.
* data/c++.m4 (b4_public_types_declare): Extract these definitions from
here, and move them...
(b4_symbol_type_declare): here.
(b4_public_types_declare): Also remove the definition of the symbol
constructors.
* data/lalr1.cc (b4_shared_declarations): Adjust: call
b4_symbol_type_declare and b4_symbol_constructor_declare.
Benchmarks show that it is more efficient to keep this copy
constructor, rather than forcing the use of the default constructor
and then assignment.
This reverts commit 7ab25ad020.
Currently, in bison's C++ parser template (`lalr.cc`), the `variant<>`
struct's `build()` method uses placement-new in the form `new (...) T`
to initialize a variant type. However, for POD variant types, this
will leave the memory space uninitialized. If we subsequently tries
to `::move` into a variant object in such state, the call can trigger
clang's undefined behavior sanitizer due to accessing the
uninitialized memory.
https://lists.gnu.org/archive/html/bison-patches/2018-08/msg00098.html
* data/variant.hh (build): Always initialize the stored value.
Signed-off-by: Akim Demaille <akim@lrde.epita.fr>
Reported by Brooks Moses <bmoses@google.com>
http://lists.gnu.org/archive/html/bison-patches/2018-02/msg00000.html
* data/lalr1.cc (YY_EXCEPTIONS): New.
Use it to disable try/catch clauses.
* doc/bison.texi (C++ Parser Interface): Document it.
* configure.ac (CXXFLAGS_NO_EXCEPTIONS): New.
* tests/atlocal.in: Receive it.
* tests/local.at (AT_FULL_COMPILE, AT_LANG_COMPILE):
Accept a new argument, extra compiler flags.
* tests/calc.at: Run the C++ calculator with exception support disabled.
Visual Studio issues a C4146 warning on '-static_cast<unsigned>(rhs)'.
The code is weird, probably to cope with INT_MIN. Let's go back to
using std::max (whose header is still included in position.hh...) like
originally, but with the needed casts.
Reported by 長田偉伸, and with help from Rici Lake.
See also
http://lists.gnu.org/archive/html/bug-bison/2013-02/msg00000.html
and commit 75ae829984.
* data/location.cc (position::add_): Take min as an int.
Use std::max.
While here, get rid of a couple of useless inlines.
Reported by Jannick.
http://lists.gnu.org/archive/html/bug-bison/2017-05/msg00001.html
"Amusingly" enough, we have the same problem with %defines when the
parser file name has backslashes or quotes: we generate #includes with
an incorrect C string.
* src/output.c (prepare_symbol_definitions): Escape properly the file
names before passing them to M4.
* data/bison.m4, data/lalr1.cc: Don't simply put the file name between
two quotes (that should have been strong enough a smell...), expect
the string to be properly quoted.
* tests/synclines.at: New tests to check this.
In 0931d14728 I removed too many
initializations from some ctors: some were not about base ctors, but
about member variables. In fact, more of them were missing to please
GCC 8.
While at it, generate more natural code for C++ without variant:
instead of
template <typename Base>
parser::basic_symbol<Base>::basic_symbol (const basic_symbol& other)
: Base (other)
, value ()
{
value = other.value
}
generate
template <typename Base>
parser::basic_symbol<Base>::basic_symbol (const basic_symbol& other)
: Base (other)
, value (other.value)
{}
* data/c++.m4 (basic_symbol::basic_symbol): Always initialize 'value',
it might be a POD without a ctor.
* data/lalr1.cc (stack_symbol_type::stack_symbol_type): Likewise.
* data/variant.hh (variant::variant): Default initialize the buffer too.
GCC 8 issues warnings whose root cause was a bit hard to find.
calc.cc: In member function 'virtual int yy::parser::parse()':
calc.cc:810:18: warning: '*((void*)&<anonymous> +8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
, location (l)
^
calc.cc: In member function 'void yy::parser::yypush_(const char*, yy::parser::stack_symbol_type&)':
calc.cc:810:18: warning: '*((void*)&<anonymous> +8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
, location (l)
^
calc.cc: In member function 'void yy::parser::yypush_(const char*, yy::parser::state_type, yy::parser::symbol_type&)':
calc.cc:810:18: warning: '*((void*)&<anonymous> +8)' may be used uninitialized in this function [-Wmaybe-uninitialized]
, location (l)
^
The problem is with locations that don't have a constructor, such as
Span (in calc.cc) which is POD. It is POD on purpose: so that we can
use that structure to test glr.cc which cannot use non POD in its
(C) stacks.
* data/c++.m4 (basic_symbol): Also ensure that 'location' is
initialized.
This file was meant to be shown as an example. Install it.
* README, data/README: Put Emacs metadata in the final section.
* examples/README: New.
* examples/variant.yy: Use %empty.
* examples/local.mk: Install both these files.
We currently generate copy constructors such as the following
one (taken from examples/variant.yy):
parser::stack_symbol_type::stack_symbol_type (const stack_symbol_type& that)
: super_type (that.state, that.location)
{
switch (that.type_get ())
{
case 3: // TEXT
case 8: // item
value.copy< ::std::string > (that.value);
break;
case 7: // list
value.copy< ::std::vector<std::string> > (that.value);
break;
case 4: // NUMBER
value.copy< int > (that.value);
break;
default:
break;
}
}
they are actually useless: we never need it.
* data/lalr1.cc: Don't generate the stack_symbol_type copy ctor.
Fix a typo so that instead of
basic_symbol::basic_symbol (typename Base::kind_type t, const int v)
we now generate
basic_symbol::basic_symbol (typename Base::kind_type t, const int& v)
* data/variant.hh (b4_basic_symbol_constructor_declare)
(b4_basic_symbol_constructor_define): Add missing reference.
Predicates with GLR are issued with synclines in the middle of C code:
case 2:
if (! (#line 6 "sempred.y" /* glr.c:816 */
new_syntax)) YYERROR;
#line 793 "sempred.tab.c" /* glr.c:816 */
break;
Reported by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-05/msg00033.html
* data/c.m4 (b4_predicate_case): Be sure to start on column 0.
It would be nicer if b4_syncline could ensure this by itself
(that would avoid ugly code when synclines are disabled), but that's
way more work.
* tests/glr-regression.at (Predicates): Be a real end-to-end test.
This would have caught this error years ago...
For instance on test 99:
In file included from @@.cc:56:
@@.hh:409:26: error: definition of implicit copy constructor for
'stack_symbol_type' is deprecated because it
has a user-declared copy assignment operator
[-Werror,-Wdeprecated]
stack_symbol_type& operator= (const stack_symbol_type& that);
^
Reported by Derek Clegg.
https://lists.gnu.org/archive/html/bison-patches/2018-05/msg00036.html
* configure.ac (warn_tests): Add -Wdeprecated.
* data/lalr1.cc (stack_symbol_type): Add an explicit copy ctor.
We cannot rely on the explicit default implementation (`= default`)
as we support C++ 98.
At least GCC 7.3, with -O1 or -O2 (but not -O0 or -O3) generates
warnings with -Wnull-dereference when using yyformat: it fails to see
yyformat cannot be null.
Reported by Frank Heckenbach, https://savannah.gnu.org/patch/?9620.
* configure.ac: Use -Wnull-dereference if supported.
* data/glr.c, data/lalr1.cc, data/yacc.c: Define yyformat in such
a way that GCC cannot not see that yyformat is defined.
Using `default: abort();` also addresses the issue, but forces
the inclusion of `stdlib.h`, which we avoid.
Sometimes `inline` would be used in *.cc files on symbols that are not
exported (useless but harmless), and sometimes on exported symbols
such as the constructor of syntax_error (harmful: linking fails).
Reported several times, including:
- by Dennis T
http://lists.gnu.org/archive/html/bug-bison/2016-03/msg00002.html
- by Frank Heckenbach
https://savannah.gnu.org/patch/?9616
* data/c++.m4 (b4_inline): New: expands to `inline` or nothing.
Use it where appropriate.
* data/lalr1.cc: Use it where appropriate.
* tests/c++.at (Syntax error as exception): Put the scanner in another
compilation unit to exercise the constructor of syntax_error.
* data/lalr1.cc, doc/bison.texi, etc/bench.pl.in, examples/variant.yy,
* tests/actions.at, tests/atlocal.in, tests/c++.at, tests/headers.at,
* tests/local.at, tests/types.at:
Don't use std::endl, it flushes uselessly, and is considered bad
style.
Currently lalr1.cc makes an out-of-bound access when trying to read @1
in rules with an empty rhs (i.e., when there is no @1) that raises an
error (YYERROR).
glr.c already gracefully handles this by using @$ as initial location
for the errors. Let's do that in yacc.c and lalr1.cc.
* data/lalr1.cc, data/yacc.c: Use @$ to initialize the error location.
* tests/actions.at: Check that case.
This style appears to be more traditional, at least in C++.
For instance in the standard, [facets.examples].
There are occurrences using "unsigned int" too though.
* data/lalr1.cc, data/location.cc, data/stack.hh: here.