* data/skeletons/d.m4 (b4_public_types_declare): Here.
* data/skeletons/lalr1.d: Adjust.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y: Use it.
* tests/calc.at: Test it.
The yychar variable was keeping the external form of the token (the
TokenKind). As the D parser translates the token to its internal
form (the SymbolKind) inside the struct Symbol, there is no need for
yychar anymore.
* data/skeletons/lalr1.d (yychar): Remove.
Use only yytoken.
* examples/d/calc/calc.y (start, end): Replace by this...
(location): new member variable in the Lexer class.
Use it.
* tests/calc.at: Use the defined location variable.
examples/c++/glr/c++-types.cc:721:24: error:
expected the class name after '~' to name a destructor
yysval.YYSTYPE::~semantic_type ();
^
Using a local typedef, for some reason, results in clang complaining
about a useless local typedef. Since anyway we don't want to keep on
using YYSTYPE and YYLTYPE, it is time to introduce proper typedefs to
reach these guys. And to be slightly in advance of the other
skeletons: use value_type, not semantic_type. This is much more
consistent with our use of the (kind, value, location) triplet.
* data/skeletons/glr2.cc (glr_state::value_type)
(glr_state::location_type): New.
(glr_state::~glr_state): Use value_type to name the dtor.
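A minimal sketch of naming a destructor through a member typedef, as
described above. The types `tracked` and `state` are hypothetical
stand-ins, not the skeleton's actual classes; only the names value_type
and yysval come from the entry.

```cpp
#include <new>

// Hypothetical semantic value type that counts its live instances.
struct tracked
{
  static int live;
  int payload;
  explicit tracked (int p) : payload (p) { ++live; }
  tracked (const tracked& that) : payload (that.payload) { ++live; }
  ~tracked () { --live; }
};
int tracked::live = 0;

// Hypothetical state keeping its semantic value in an anonymous union,
// so that the value's lifetime is managed by hand.
struct state
{
  using value_type = tracked;

  // Build the value in place in the union's storage.
  explicit state (int p) { new (&yysval) value_type (p); }

  // Name the destructor through the typedef rather than through a
  // macro such as YYSTYPE.
  ~state () { yysval.~value_type (); }

  state (const state&) = delete;
  state& operator= (const state&) = delete;

  union { value_type yysval; };
};
```

With this layout, the count of live `tracked` objects goes back to
zero when a `state` goes out of scope.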
Currently we are using pointers. The whole point of
glr2.cc (vs. glr.cc) is precisely to allow genuine C++ objects to be
semantic values. Let's make that work.
* data/skeletons/glr2.cc (glr_state::glr_state): Be sure to initialize
yysval.
(glr_state): Add copy-ctor, assignment and dtor.
(glr_state::copyFrom): Be sure to initialize the destination if it was
not.
(glr_state::~glr_state): Destroy the semantic value.
* examples/c++/glr/ast.hh: Rewrite so that we use genuine objects,
rather than a traditional OOP hierarchy that requires dealing with
pointers.
With help from Bruno Belanyi <bruno.belanyi@epita.fr>.
* examples/c++/glr/c++-types.yy: Remove memory management.
Use true objects.
(main): Don't reach yydebug directly.
* examples/c++/glr/local.mk: We need C++11.
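The copy constructor, assignment and destructor mentioned above could
look roughly like the sketch below. It assumes a state whose value may
or may not be constructed yet; `node`, `resolved_` and `copy_from` are
illustrative names, not the skeleton's code.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical semantic value: a genuine C++ object, not a pointer.
struct node
{
  std::string name;
};

class state
{
public:
  using value_type = node;

  state () : resolved_ (false) {}

  explicit state (const value_type& v) : resolved_ (true)
  {
    new (&yysval_) value_type (v);
  }

  state (const state& that) : resolved_ (false) { copy_from (that); }

  state& operator= (const state& that)
  {
    if (this != &that)
      copy_from (that);
    return *this;
  }

  ~state ()
  {
    if (resolved_)
      yysval_.~value_type ();
  }

  bool resolved () const { return resolved_; }

  const value_type& value () const
  {
    assert (resolved_);
    return yysval_;
  }

private:
  void copy_from (const state& that)
  {
    if (that.resolved_)
      {
        if (resolved_)
          yysval_ = that.yysval_;                    // already alive: assign
        else
          new (&yysval_) value_type (that.yysval_);  // raw memory: construct
      }
    else if (resolved_)
      yysval_.~value_type ();                        // source has no value
    resolved_ = that.resolved_;
  }

  bool resolved_;
  union { value_type yysval_; };
};
```

The key point is that the destination is constructed when it holds no
value yet, and only assigned to when it already holds one.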
When expanding the GLR stack, none of the pointers were updated to
reflect the new location of the displaced objects.
This fixes
748: Incorrect lookahead during nondeterministic GLR: glr2.cc
* data/skeletons/glr2.cc (yyexpandGLRStack): Update the split point
and the stack tops.
(reduceToOneStack): Factor a bit.
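As an illustration of the problem, here is a sketch of a stack that
hands out raw pointers into its storage and rebases them when the
storage moves. The `stack` type and its members are assumptions made
for the example; the real skeleton is organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct item
{
  int value;
};

// Hypothetical stack whose split point and tops point into `items`.
struct stack
{
  std::vector<item> items;
  item* split_point = nullptr;
  std::vector<item*> tops;

  // Grow the storage, then rebase every pointer that referred to it.
  void expand (std::size_t new_capacity)
  {
    assert (items.capacity () < new_capacity);

    // Record positions as offsets while the old storage is still valid.
    const item* old_base = items.data ();
    const std::ptrdiff_t split = split_point ? split_point - old_base : -1;
    std::vector<std::ptrdiff_t> offsets;
    for (const item* top : tops)
      offsets.push_back (top - old_base);

    items.reserve (new_capacity);   // may move all the items

    // Recompute the pointers relative to the new storage.
    item* new_base = items.data ();
    if (0 <= split)
      split_point = new_base + split;
    for (std::size_t i = 0; i < tops.size (); ++i)
      tops[i] = new_base + offsets[i];
  }
};
```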
This test fails:
748: Incorrect lookahead during nondeterministic GLR: glr2.cc
It consumes lots of stack space, so at some point we need to expand
it. Because of Boolean logic mistakes, we then claim
memory-exhausted (first error). Hence we jump to cleaning the
stack (popall_), calling all the destructors, and at some point we
crash with heap-use-after-free (second error).
This commit fixes the first error. Unfortunately, even though we now
do expand the stack, we crash again with (another)
heap-use-after-free, not addressed here.
Eventually, we should make sure popall_() works properly.
* data/skeletons/glr2.cc (yyexpandGLRStackIfNeeded): Return true iff
success (i.e., memory not exhausted).
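A sketch of the "return true iff success" convention, using a
hypothetical `bounded_stack` that only tracks sizes. The doubling
policy and the member names are assumptions, not taken from glr2.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct bounded_stack
{
  std::size_t size;
  std::size_t capacity;
  std::size_t max_capacity;

  // Return true iff at least HEADROOM free slots are available on exit,
  // i.e., false means "memory exhausted".
  bool expand_if_needed (std::size_t headroom)
  {
    assert (size <= capacity && capacity <= max_capacity);
    while (capacity - size < headroom)
      {
        if (capacity == max_capacity)
          return false;
        capacity = std::min (max_capacity,
                             std::max<std::size_t> (1, 2 * capacity));
      }
    return true;
  }
};
```

A caller would then report memory exhaustion only when the function
returns false, which is the condition the entry says was inverted.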
From a debugger, it is easier to pass a file name than to work on
stdin.
* examples/c++/glr/c++-types.yy: Reduce scopes.
Avoid YYSTYPE/YYLTYPE: use the C++ types.
(input, process): New.
(main): Use them.
And also, remove the incorrect indentation of these comments:
- /* YYR2[YYN] -- Number of symbols on the right hand side of rule YYN. */
+/* YYR2[RULE-NUM] -- Number of symbols on the right-hand side of rule RULE-NUM. */
static const yytype_int8 yyr2[] =
{
0, 2, 4, 0, 2, 1, 1, 1, 3, 2,
I don't remember why this indentation was added (in
0991e29b75), but it seems wrong,
at least for yacc.c. I suspect this was done with lalr1.cc (where
this is embedded in the class definition, so it should be indented),
but today lalr1.cc uses other routines to output these comments.
* data/skeletons/bison.m4 (b4_integral_parser_tables_map): Improve the
wording of the comments of some tables.
* data/skeletons/c.m4 (b4_integral_parser_table_define): Remove
indentation.
A glr_stack_item has "raw" memory to store either a glr_state or a
semantic_option. glr_stack_item::setState stores a state using a copy
assignment. However, this is more like a construction: we are
starting from "raw" memory, so use the placement new operator instead.
While it probably makes no difference when parse.assert is disabled,
it does make one when it is: the constructor properly initializes the
magic number, the assignment does not. So without these changes, the
next commit (which stores genuine objects in semantic values) fails
tests 712 and 730 because of incorrect magic numbers.
* data/skeletons/glr2.cc (glr_stack_item::setState): Build the state,
don't just copy it.
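The difference between assigning into raw memory and constructing in
it can be sketched as follows. The `state` and `stack_item` types, the
MAGIC value and the `valid` helper are hypothetical; they only mimic
the behavior described above, where assignment does not set the magic
number.

```cpp
#include <new>

struct state
{
  enum : unsigned { MAGIC = 0xC0FFEEu };   // arbitrary illustrative value

  explicit state (int s) : magic_ (MAGIC), lr_state (s) {}
  state (const state& that) : magic_ (MAGIC), lr_state (that.lr_state) {}

  // Assignment copies the payload only: it assumes *this is already a
  // live object whose magic number was set by a constructor.
  state& operator= (const state& that)
  {
    lr_state = that.lr_state;
    return *this;
  }

  bool valid () const { return magic_ == MAGIC; }

  unsigned magic_;
  int lr_state;
};

// Hypothetical stack item with "raw" storage for a state.
struct stack_item
{
  alignas (state) unsigned char raw[sizeof (state)];

  // Build the state in the raw storage instead of assigning to it.
  state& set_state (const state& s)
  {
    return *new (raw) state (s);
  }
};
```

Because `set_state` runs a constructor, the stored state has a correct
magic number whatever bytes were in the storage beforehand.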
ast.hh:24:7: error: 'Node' has no out-of-line virtual method definitions; its vtable will be emitted in every translation unit [-Werror,-Wweak-vtables]
class Node
^
ast.hh:57:7: error: 'Nterm' has no out-of-line virtual method definitions; its vtable will be emitted in every translation unit [-Werror,-Wweak-vtables]
class Nterm : public Node
^
ast.hh:102:7: error: 'Term' has no out-of-line virtual method definitions; its vtable will be emitted in every translation unit [-Werror,-Wweak-vtables]
class Term : public Node
^
* examples/c++/glr/ast.hh: Define the destructors out of the class
definition.
This does not change anything, it is still in the header, but that
does pacify clang.
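A sketch of the pattern: the destructors are declared in the class and
defined after it. The classes below are simplified stand-ins for the
ones in ast.hh, and the member `to_string` is an assumption.

```cpp
#include <string>

class Node
{
public:
  virtual ~Node ();                            // declared only
  virtual std::string to_string () const = 0;
};

class Term : public Node
{
public:
  explicit Term (const std::string& name) : name_ (name) {}
  ~Term () override;                           // declared only
  std::string to_string () const override { return name_; }

private:
  std::string name_;
};

// Out-of-class definitions of the virtual destructors.
Node::~Node () {}
Term::~Term () {}
```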
When debugging these parsers, we really need debug traces.
Enable them, and bind them to $YYDEBUG.
* tests/glr-regression.at: Support the YYDEBUG envvar.
As a consequence, now that syntactic ambiguities are reported, adjust
the expected output.
(No users destructors if stack 0 deleted): Don't return 0 on memory
exhaustion, really return the parser's status, and adjust expectations.
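Binding traces to an environment variable can be sketched as below.
The helper `env_flag` is hypothetical; how the tests actually read
$YYDEBUG is not shown in this entry.

```cpp
#include <cstdlib>

// Hypothetical helper: true iff the environment variable NAME is set.
inline bool env_flag (const char* name)
{
  return std::getenv (name) != nullptr;
}

// Possible use with a generated C++ parser object `p` built with
// tracing enabled (an assumption about the caller):
//   p.set_debug_level (env_flag ("YYDEBUG"));
```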
Currently the example really looks like C. Instead of a union of
structs to implement the AST, use a hierarchy. It would be nice to
feature a C++17 version with std variants.
* examples/c++/glr/c++-types.yy (Node, free_node, new_nterm)
(new_term): Move into...
* examples/c++/glr/ast.hh: here, a proper C++ hierarchy.
Amusingly enough, glr2.cc still had its core function, yyparse, being
a free function instead of a member function.
* data/skeletons/glr2.cc (yyparse): Remove this free function called
from yyparser::parse. Inline its body into...
(yyparser::parse): this member function.
This requires moving the yychar, etc. macros a bit.
Access to token can be simplified (the
b4_namespace_ref::b4_parser_class prefix is no longer needed).
Remove the useless conditional b4_pure_if: the skeleton is, of course,
pure (no global variables). Make glr2.cc intrinsically pure.
* data/skeletons/glr2.cc (b4_pure_if): Remove definition and uses.
(b4_lex): New.
Stolen from lalr1.cc to avoid needing to use the one from c.m4.
Currently, yycompressStack expects the free items to be states only.
That's not the case.
With this fix, 712 and 730 pass. 748 still fails, but later and
differently (heap-use-after-free).
* data/skeletons/glr2.cc (glr_stack_item::setState): New.
(glr_stack_item::yycompressStack): Use it.
* tests/glr-regression.at: Adjust.
A glr_state keeps track of its predecessor using an offset relative
to itself (i.e., pointer subtraction). Unfortunately we sometimes
have to compute offsets for pointers that live in different
containers, in particular in yyfillin. In that case there is no
reason for the distance between the two objects to be a multiple of
the object size (0x40 on my machine), and the resulting ptrdiff_t may
be "wrong", i.e., it does not allow recovering one from the other. We
cannot use "typed" pointer arithmetic here: the Euclidean division
gets it wrong. So use "plain" char* pointers.
Fixes 718 (Duplicate representation of merged trees: glr2.cc) and
examples/c++/glr/c++-types.
Still XFAIL:
712: Improper handling of embedded actions and dollar(-N) in GLR parsers: glr2.cc
730: Incorrectly initialized location for empty right-hand side in GLR: glr2.cc
748: Incorrect lookahead during nondeterministic GLR: glr2.cc
* data/skeletons/glr2.cc (glr_state::as_pointer_): New.
(glr_state::pred): Use it.
* examples/c++/glr/c++-types.test: The test passes.
* tests/glr-regression.at (Duplicate representation of merged trees:
glr2.cc): Passes.
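A sketch of byte-based offsets between objects living in different
containers. Only the names as_pointer_ and pred come from the entry;
`set_pred`, `pred_offset` and the layout are assumptions. Note that
subtracting pointers into unrelated objects is formally outside what
the C++ standard defines; like the fix described above, the sketch
relies on a flat address space.

```cpp
#include <cstddef>

struct state
{
  // Distance in bytes from the predecessor to this object, 0 if none.
  std::ptrdiff_t pred_offset = 0;
  int lr_state = 0;

  // View this object as a plain byte address.
  const char* as_pointer_ () const
  {
    return reinterpret_cast<const char*> (this);
  }

  void set_pred (const state* p)
  {
    // Byte distance: no division by sizeof (state) is involved, so the
    // result is exact even across different containers.
    pred_offset = p ? as_pointer_ () - p->as_pointer_ () : 0;
  }

  const state* pred () const
  {
    return pred_offset
      ? reinterpret_cast<const state*> (as_pointer_ () - pred_offset)
      : nullptr;
  }
};
```

With typed `state*` subtraction, the distance would be divided by
sizeof (state), and any remainder would be lost.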
The use of YY_IGNORE_NULL_DEREFERENCE_BEGIN/END in `check_` is to
please GCC 10:
glr-regr8.cc: In member function 'YYRESULTTAG glr_stack::yyresolveValue(glr_state&)':
glr-regr8.cc:1433:21: error: potential null pointer dereference [-Werror=null-dereference]
1433 | YYASSERT (this->magic_ == MAGIC);
| ~~~~~~^~~~~~
glr-regr8.cc:905:40: note: in definition of macro 'YYASSERT'
905 | # define YYASSERT(Condition) ((void) ((Condition) || (abort (), 0)))
| ^~~~~~~~~
* data/skeletons/glr2.cc (glr_state::check_): New.
Use it in the member functions.
We obviously have broken pointer arithmetic that hands us
glr_stack_items that are not glr_stack_items. Have a simple check for
this, to have earlier failures.
* data/skeletons/glr2.cc (glr_stack_item::check_): New.
Use it.
(glr_stack_item::contents): Avoid the useless struct.
Fix minor stylistic issues.
When installed on master as of 2020-12-05 (on top of "glr2.cc: fix
when the stack is not expandable"), almost all the GLR regression tests
fail (with a SEGV):
709: Badly Collapsed GLR States: glr2.cc FAILED (glr-regression.at:130)
712: Improper handling of embedded actions and dollar(-N) in GLR parsers: glr2.cc FAILED (glr-regression.at:275)
715: Improper merging of GLR delayed action sets: glr2.cc FAILED (glr-regression.at:404)
718: Duplicate representation of merged trees: glr2.cc FAILED (glr-regression.at:502)
721: User destructor for unresolved GLR semantic value: glr2.cc FAILED (glr-regression.at:566)
724: User destructor after an error during a split parse: glr2.cc FAILED (glr-regression.at:624)
727: Duplicated user destructor for lookahead: glr2.cc FAILED (glr-regression.at:724)
730: Incorrectly initialized location for empty right-hand side in GLR: glr2.cc FAILED (glr-regression.at:823)
733: No users destructors if stack 0 deleted: glr2.cc FAILED (glr-regression.at:911)
736: Corrupted semantic options if user action cuts parse: glr2.cc FAILED (glr-regression.at:974)
739: Undesirable destructors if user action cuts parse: glr2.cc FAILED (glr-regression.at:1042)
742: Leaked semantic values if user action cuts parse: glr2.cc FAILED (glr-regression.at:1173)
748: Incorrect lookahead during nondeterministic GLR: glr2.cc FAILED (glr-regression.at:1546)
751: Leaked semantic values when reporting ambiguity: glr2.cc FAILED (glr-regression.at:1639)
754: Leaked lookahead after nondeterministic parse syntax error: glr2.cc FAILED (glr-regression.at:1710)
757: Uninitialized location when reporting ambiguity: glr2.cc FAILED (glr-regression.at:1794)
766: Predicates: glr2.cc FAILED (glr-regression.at:2045)
These pass:
745: Incorrect lookahead during deterministic GLR: glr2.cc ok
760: Missed %merge type warnings when LHS type is declared later: glr2.cc ok
763: Ambiguity reports: glr2.cc ok
With Valentin Tolmer's "glr2.cc: Fix memory corruption bug" commit,
these tests fail "gracefully":
712: Improper handling of embedded actions and dollar(-N) in GLR parsers: glr2.cc FAILED (glr-regression.at:268)
730: Incorrectly initialized location for empty right-hand side in GLR: glr2.cc FAILED (glr-regression.at:816)
748: Incorrect lookahead during nondeterministic GLR: glr2.cc FAILED (glr-regression.at:1539)
And these do not end:
709: Badly Collapsed GLR States: glr2.cc FAILED (glr-regression.at:123)
715: Improper merging of GLR delayed action sets: glr2.cc FAILED (glr-regression.at:397)
718: Duplicate representation of merged trees: glr2.cc FAILED (glr-regression.at:495)
751: Leaked semantic values when reporting ambiguity: glr2.cc FAILED (glr-regression.at:1632)
With "tests: glr2.cc: run the glr-regression tests", none loop, and
709, 715, and 751 pass. Only 718 still fails.
* tests/glr-regression.at: Run all the tests with glr2.cc.
* tests/local.at (AT_GLR2_CC_IF): New.
When the tests from "tests: glr2.cc: run the glr-regression tests" are
run, before this commit the following tests used to loop endlessly:
709: Badly Collapsed GLR States: glr2.cc FAILED (glr-regression.at:123)
715: Improper merging of GLR delayed action sets: glr2.cc FAILED (glr-regression.at:397)
718: Duplicate representation of merged trees: glr2.cc FAILED (glr-regression.at:495)
751: Leaked semantic values when reporting ambiguity: glr2.cc FAILED (glr-regression.at:1632)
After this commit, no test loops and 709, 715, and 751 pass. Only 718
still fails.
* data/skeletons/glr2.cc (yyresolveValue): Add missing incrementation
of the iteration variable.
* data/skeletons/glr2.cc: Use 'const' on variables and applicable
member functions.
Improve comments.
Use references where applicable.
Enforce names_like_this, notLikeThis.
Reduce scopes.
* tests/glr-regression.at: Adjust the tests to be more independent of
the language, and run them with glr.cc.
Some tests relied on yychar, yylval and yylloc being global variables:
pass arguments instead.
* tests/glr-regression.at: Instead of using AT_BISON_CHECK and
AT_COMPILE, use AT_FULL_COMPILE. This is shorter, and makes it easier
to add support for other programming languages.
* tests/glr-regression.at: Use %expect and %expect-rr in the grammar
files, rather than accepting diagnostics.
This will make it easier to support other programming languages.
When using glr.cc, the C function yyparse is an internal detail that
should not be exposed. Users might call it by accident (I did).
* data/skeletons/glr.c (yyparse): When used for glr.cc, rename as yy_parse_impl.
* data/skeletons/glr.cc: Adjust.