* data/lalr1.cc, doc/bison.texi, etc/bench.pl.in, examples/variant.yy,
* tests/actions.at, tests/atlocal.in, tests/c++.at, tests/headers.at,
* tests/local.at, tests/types.at:
Don't use std::endl, it flushes uselessly, and is considered bad
style.
Currently lalr1.cc makes an out-of-bound access when trying to read @1
in rules with an empty rhs (i.e., when there is no @1) that raises an
error (YYERROR).
glr.c already gracefully handles this by using @$ as initial location
for the errors. Let's do that in yacc.c and lalr1.cc.
* data/lalr1.cc, data/yacc.c: Use @$ to initialize the error location.
* tests/actions.at: Check that case.
This style appears to be more traditional, at least in C++.
For instance in the standard, [facets.examples].
There are occurrences using "unsigned int" too though.
* data/lalr1.cc, data/location.cc, data/stack.hh: here.
There are warnings (-Wextra) in generated C++ code:
ltlparse.cc: In member function 'ltlyy::parser::symbol_number_type
ltlyy::parser::by_state::type_get() const':
ltlparse.cc:452:33: warning: enumeral and non-enumeral type in
conditional expression
return state == empty_state ? empty_symbol : yystos_[state];
Reported by Alexandre Duret-Lutz.
It turns out that -Wall and -Wextra were disabled because of a stupid
typo.
* configure.ac: Fix the stupid typo.
* data/lalr1.cc, src/AnnotationList.c, src/InadequacyList.c,
* src/ielr.c, src/print.c, src/scan-code.l, src/symlist.c,
* src/symlist.h, src/symtab.c, src/tables.c, tests/actions.at,
* tests/calc.at, tests/cxx-type.at, tests/glr-regression.at,
* tests/named-refs.at, tests/torture.at:
Fix warnings, mostly issues about variables used only with assertions,
which are disabled with -DNDEBUG.
The previous patches ensure that symbols (symbol_type and
stack_symbol_type) can be empty, cleared, and their emptiness can be
checked. Therefore, yyempty, which codes whether yyla is empty or
not, is now useless.
In C skeletons (e.g., yacc.c), the fact that the lookahead is empty is
coded by "yychar = YYEMPTY", which is exactly what this patch
restores, since yychar/yytoken corresponds to yyla.type.
* data/lalr1.cc (yyempty): Remove.
Rather, depend on yyla.empty ().
During error recovery, when discarding the lookeahead, we don't
destroy it, which is caught by parse.assert assertions.
Reported by Antonio Silva Correia.
With an analysis and suggested patch from Michel d'Hooge.
<http://savannah.gnu.org/support/?108481>
* tests/c++.at (Variants): Strengthen the test to try syntax errors
with discarded lookahead.
The symbol destructor is currently the only means to clear a symbol.
Unfortunately during error recovery we might have to clear the
lookahead, which is a local variable (yyla) that has not yet reached
its end of scope.
Rather that duplicating the code to destroy a symbol, or rather than
destroying and recreating yyla, let's provide a means to clear a
symbol.
Reported by Antonio Silva Correia, with an analysis from Michel d'Hooge.
<http://savannah.gnu.org/support/?108481>
* data/c++.m4, data/lalr1.cc (basis_symbol::clear, by_state::clear)
(by_type::clear): New.
(basic_symbol::~basic_symbol): Use clear.
When variant are enabled, the yylhs variable (the left-hand side of
the rule being reduced, i.e. $$ and @$) is explicitly destroyed when
YYERROR is called. This is because before running the user code, $$
is initialized, so that the user can properly use it.
However, when quitting yyparse, yylhs is also reclaimed by the C++
compiler: the variable goes out of scope.
Instead of trying to be too smart, let the compiler do its job: reduce
the scope of yylhs to exactly the reduction. This way, whatever the
type of scope exit (regular, exception, return, goto...) this variable
will be properly reclaimed.
Reported by Paolo Simone Gasparello.
<http://lists.gnu.org/archive/html/bug-bison/2013-10/msg00003.html>
* data/lalr1.cc (yyparse): Reduce the scope of yylhs.
* tests/c++.at: We now pass this test.
* data/glr.c (yyLRgotoState): Name the symbol argument yysym, instead
of yylhs.
* data/lalr1.cc (yy_lr_goto_state_): Likewise.
* data/lalr1.java (yy_lr_goto_state_): New, modeled after the previous
two routines.
Use it.
Building C++ parsers with -Wsuggest-attribute=const and
-Wsuggest-attribute=noreturn triggers warning in generated code.
* data/lalr1.cc: Call b4_attribute_define.
(debug_stream, debug_level): Flag as pure.
* tests/headers.at (Several parsers): There are now more YY macros
that "leak".
* origin/maint:
glr.cc: fix a clang warning
maint: update copyright years
build: fix VPATH issue
build: avoid clang's colored diagnostics in the test suite
tests: please clang and use ".cc", not ".c", for C++ input
gnulib: update
skeletons: avoid empty switch constructs
lalr1.cc: fix compiler warnings
yacc.c: do not use __attribute__ unprotected
tests: style changes
* data/c.m4 (b4_symbol_type_register, b4_type_define_tag)
(b4_symbol_value_union, b4_value_type_setup_union)
(b4_value_type_setup_variant, b4_value_type_setup):
New.
(b4_value_type_define): Use it to set up properly the type.
Handle the various possible values of api.value.type.
* data/c++.m4 (b4_value_type_declare): Likewise.
* data/lalr1.cc (b4_value_type_setup_variant): Redefine.
* tests/types.at: New.
Exercise all the C/C++ skeletons with different types of
api.value.type values.
* tests/local.mk, tests/testsuite.at: Use it.
* doc/bison.texi (%define Summary): Document api.value.type.
* NEWS: Advertise it, together with api.token.constructor.
Reported by Rob Conde.
http://lists.gnu.org/archive/html/bug-bison/2013-03/msg00003.html
* data/c.m4 (b4_symbol_actions): Rename as...
(_b4_symbol_actions): this.
(b4_symbol_actions): New wrapper.
Do not emit empty switches.
Adjust all b4_symbol_actions callers.
Reported by Rob Conde.
http://lists.gnu.org/archive/html/bug-bison/2013-03/msg00003.html
* data/stack.hh (operator=, stack(const stack&)): Make this class
uncopyable, i.e., "undefine" these operators: make them private and
don't implement them.
(clear): New.
* data/lalr1.cc: Use it instead of an assignment.
(parser): Make this class uncopyable.
* data/c++.m4, data/lalr1.cc (by_state, by_type): Do not use -1 to
denote the absence of value, as GCC then fears that this -1 might
be used to dereference arrays (such as yytname).
Use 0, which corresponds to $accept, which is valueless (the needed
property: the symbol destructor must not try to reclaim the memory
associated with the symbol).
* data/c++.m4 (b4_public_types_declare): Declare token_number_type soon.
Introduce symbol_number_type (wider than token_number_type).
Clarify the requirement that kind_type from by_state and by_type
denote the _input_ type (required by the constructor), not the stored type.
Use symbol_number_type and token_number_type where appropriate, instead
of int.
* data/lalr1.cc: Adjust to these changes.
Propagate "symbol_number_type".
Invoke "type_get ()" instead of read "type" directly.
Many 'inline' keywords were in the declarations. They rather belong in
definitions, so move them.
* data/c++.m4 (basic_symbol, by_type): Many inlines here.
* data/lalr1.cc (yytranslate_, yy_destroy_, by_state, yypush_, yypop_): Inline
these as well.
(move): Move the definition outside the struct, where it belongs.
Now that symbols behaves properly, we can eliminate special routines
that are no longer needed.
* data/c++.m4, data/glr.cc, data/lalr1.cc, data/variant.hh:
Remove useless assignment operators and copy constructors.
As a consequence, remove useless includes for "abort".
The current approach was too adhoc: the symbols were not sufficiently
self-contained, in particular wrt memory management. The "new"
guideline is the one that should have been followed from the start:
let the symbols handle themslves, instead of leaving their users to
it. It was justified by the will to avoid gratuitious moves and
copies, but the current approach does not seem to be slower, yet it
will probably be simpler to adjust to support move semantics from
C++11.
The documentation says that the %parse-param are available from the
%destructor. In retrospect, that was a silly design decision, which
we can break for variants, as its a new feature. It should be phased
out for non-variants too.
* data/variant.hh: A variant never knows if it stores something or
not, it is up to its users to store this information.
Yet, in parse.assert mode, make sure the empty/filled variants
are properly used.
(b4_symbol_constructor_define_): Don't call directly the symbol
constructor, to save a useless temporary.
* data/stack.hh (push): Steal the pushed value instead of duplicating
it.
This will simplify the callers of push, who handled this "move"
approach themselves.
* data/c++.m4 (basic_symbol): Let -1, as kind, denote the fact that
a symbol is empty.
This is needed for instance when shifting the lookahead: yyla
is given as argument to "push", and its value is then moved on
the stack. But then yyla must be declared "empty" so that its
destructor won't be called.
(basic_symbol::move): New.
Move the responsibility of calling the destructor from yy_destroy
to ~basic_symbol in the case of variants.
* data/lalr1.cc (stack_symbol_type): Now a derived class from its
previous value, so that we can add a constructor from a symbol_type.
(by_state): State -1 means empty.
(yypush_): Factor, by calling one overload from the other one, and
using the new semantics of stack::push.
No longer reclaim by hand the memory from rhs symbols, since now
that we store objects with proper destructors, they will be reclaimed
automatically.
Conversely, be sure to delete yylhs.
* tests/c++.at (C++ Variant-based Symbols): New "unit" test for
symbols.
* data/variant.hh (variant, operator=): Make private.
* data/c++.m4 (operator=): New, to avoid needing a definition of that operator
for each class member (such as a possible variant).
* data/glr.cc, data/lalr.cc: Add the necessary include for the abort.
A "symbol" groups together the symbol type (INT, PLUS, etc.), its
possible semantic value, and its optional location. The type is
needed to access the value, as it is stored as a variant/union.
There are two kinds of symbols. "symbol_type" are "external symbols":
they have type, value and location, and are returned by yylex.
"stack_symbol_type" are "internal symbols", they group state number,
value and location, and are stored in the parser stack. The type of
the symbol is computed from the state number.
The class template symbol_base_type<Exact> factors the code common to
stack_symbol_type and symbol_type. It uses the Curiously Recurring
Template pattern so that we can always (static_) downcast to the exact
type. symbol_base_type features value and location, and delegates the
handling of the type to its parameter.
When trying to generalize the support for variant, a significant issue
was revealed: because stack_symbol_type and symbol_type _derive_ from
symbol_base_type, the type/state member is defined _after_ the value
and location. In C++ the order of the definition of the members
defines the order in which they are initialized, things go backward:
the value is initialized _before_ the type. This is wrong, since the
type is needed to access the value.
Therefore, we need another means to factor the common code, one that
ensures the order of the members.
The idea is simple: define two (base) classes that code the symbol
type ("by_type" codes it by its type, and "by_state" by the state
number). Define basic_symbol<Base> as the class template that
provides value and location support. Make it _derive_ from its
parameter, by_type or by_state. Then define stack_symbol_type and
symbol_type as basic_symbol<by_state>, basic_symbol<by_type>. The
name basic_symbol was chosen by similarity with basic_string and
basic_ostream.
* data/c++.m4 (symbol_base_type<Exact>): Remove, replace by...
(basic_symbol<Base>): which derives from its parameter, one of...
(by_state, by_type): which provide means to retrieve the actual type of
symbol.
(symbol_type): Is now basic_symbol<by_type>.
(stack_symbol_type): Is now basic_symbol<by_state>.
* data/lalr1.cc: Many adjustments.
The commit 38de4e570f underquoted the
content of the comments, which resulted in losing square brackets in
the comments. Besides, some other invocations were underquoting the
effective arguments.
* data/c.m4 (b4_comment_): Properly quote the comment.
(b4_comment_, b4_comment): Move to...
* data/c-like.m4: here, so that...
* data/java.m4: can use it instead of its own copy.
* data/bison.m4 (b4_integral_parser_tables_map): Fix some comments.
* data/lalr1.cc, data/lalr1.java, data/yacc.c: Comment fixes.
* data/lalr1.cc: Reorder a bit to factor some CPP directives.
The YYLEX existed only to support YYLEX_PARAM, which is now removed.
This macro was a nuisance, since incorrect yylex calls where pointed
the macro _use_, instead of its definition.
* data/c.m4 (b4_lex_formals, b4_lex): New.
* data/glr.c, data/yacc.c: Use it.
* data/lalr1.cc (b4_lex): New.
Use it.
squash! skeletons: no longer call yylex via a CPP macro
The "variant" structure provides a means to store, in a typeless way,
C++ objects. Manipulating it without provide the type of the stored
content is doomed to failure. So provide a means to copy in a type
safe way, and prohibit typeless assignments.
* data/c++.m4 (symbol_type::move): New.
* data/lalr1.cc: Use it.
* data/variant.hh (b4_variant_define): Provide variant::copy.
Let variant::operator= abort.
We cannot undefine it, yet, as it is still uses by the implicit
assigment in symbols, which must also be disabled.
Equip variants with more checking code. Provide a means to request
includes.
* data/variant.hh (b4_variant_includes): New.
* data/lalr1.cc: Use it.
* data/variant.hh (variant::built): Define at the end, as a private member.
(variant::tname): New.
Somewhat makes "built" useless, but let's keep both for a start, in
case using "typeinfo" is considered unacceptable in some environments.
Fix some formatting issues.
* origin/maint:
parser: accept #line NUM
m4: use a safer pattern to enable/disable output
tests: beware of gnulib's need for config.h
gnulib: update
yacc.c, glr.c: check and fix the display of locations
formatting changes
glr.c: remove stray macro
Conflicts:
data/c.m4
data/glr.cc
data/lalr1.cc
data/lalr1.java
data/location.cc
data/stack.hh
data/yacc.c
src/scan-gram.l
Work on some other areas of Bison revealed that some macros expanded
to be expanded only once were actually expanded several times. This
was due to the fact that changecom was not properly restored each
time, and macro names appearing in comments were then expanded.
Introduce begin/end macros which are easier to match that
changecom()/changecom(#).
* data/bison.m4 (b4_output_begin, b4_output_end): New.
* data/glr.c, data/glr.cc, data/lalr1.cc, data/lalr1.java,
* data/location.cc, data/stack.hh, data/yacc.c:
Use them.