7709 Commits

Author SHA1 Message Date
Akim Demaille
e51e89856a glr2.cc: add support for variants
(Bison) Variants are extremely picky, which makes them both
annoying (lots of micro-details must be taken care of) and
precious (all the micro-details must be taken care of, in particular
object lifetime).

So (i) each time a semantic value is stored, it must be stored in a
place that exists, and (ii) each time a semantic value is discarded,
its place must have been emptied.

Example of (i)

    - new (&yys.value ()) value_type (s->value ());
    + {]b4_variant_if([[
    +   new (&yys.value ()) value_type ();
    +   ]b4_symbol_variant([yy_accessing_symbol (s->yylrState)],
    +                      [yys.value ()], [copy], [s->value ()])], [[
    +   new (&yys.value ()) value_type (s->value ());]])[
    + }

Example of (ii)

      yyparser.yy_destroy_ ("Error: discarding",
    -                       yytoken, &yylval]b4_locations_if([, &yylloc])[);
    +                       yytoken, &yylval]b4_locations_if([, &yylloc])[);]b4_variant_if([[
    + // Value type destructor.
    + ]b4_symbol_variant([[YYTRANSLATE (this->yychar)]], [[yylval]], [[template destroy]])])[
      this->yychar = ]b4_namespace_ref[::]b4_parser_class[::token::]b4_symbol(empty, id)[;
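
As a minimal sketch of rules (i) and (ii), outside of the skeleton and under stated assumptions: `slot`, `emplace`, `destroy` and `slot_demo` are hypothetical names, not Bison's variant API.  The asserts make both rules explicit.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical one-value slot; NOT Bison's variant class.
template <typename T>
class slot
{
public:
  slot () : ptr_ (nullptr) {}
  // Rule (ii): a slot must have been emptied before it goes away.
  ~slot () { assert (!ptr_); }
  slot (const slot&) = delete;
  slot& operator= (const slot&) = delete;

  // Rule (i): build the value in storage that exists and is empty.
  void emplace (const T& v)
  {
    assert (!ptr_);
    ptr_ = new (raw_) T (v);
  }

  T& get ()
  {
    assert (ptr_);
    return *ptr_;
  }

  // Rule (ii): run the destructor by hand, then mark the slot empty.
  void destroy ()
  {
    assert (ptr_);
    ptr_->~T ();
    ptr_ = nullptr;
  }

  bool empty () const { return !ptr_; }

private:
  alignas (T) unsigned char raw_[sizeof (T)];
  T* ptr_;
};

// Usage: store, read, then discard.
inline void slot_demo ()
{
  slot<std::string> s;
  s.emplace ("x");
  assert (s.get () == "x");
  s.destroy ();
  assert (s.empty ());
}
```

Forgetting the call to `destroy` trips the assertion in the destructor, which is the kind of mistake the skeleton has to avoid by hand.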

However, in some places we must not be "pure".  In particular:

    glr_stack_item (const glr_stack_item& other) YY_NOEXCEPT YY_NOTHROW
      : is_state_ (other.is_state_)
    {
      std::memcpy (raw_, other.raw_, union_size);
    }

still must use memcpy, because the constructor would change pred, and
it must not.  This constructor is used only when resizing the stack,
in which case pred (which is relative) must not be "adjusted".
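
To illustrate why a raw copy is the right tool when resizing, here is a sketch that assumes pred is stored as an offset relative to the item itself.  `item`, `pred_of` and `grow` are hypothetical names, not the skeleton's.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stack item: 'pred' is an offset relative to the item
// itself (0 means "no predecessor"), not an absolute pointer.
struct item
{
  int state;
  std::ptrdiff_t pred;
};

inline const item* pred_of (const item* it)
{
  return it->pred ? it + it->pred : nullptr;
}

// Grow the stack by copying raw bytes: offsets are copied verbatim, so
// every item still designates the same neighbor in the new storage.
inline std::vector<item> grow (const std::vector<item>& old,
                               std::size_t capacity)
{
  assert (old.size () <= capacity);
  std::vector<item> res (capacity);
  if (!old.empty ())
    std::memcpy (res.data (), old.data (), old.size () * sizeof (item));
  return res;
}

inline void grow_demo ()
{
  std::vector<item> stack;
  stack.push_back (item{10, 0});
  stack.push_back (item{11, -1});
  stack.push_back (item{12, -1});
  const std::vector<item> bigger = grow (stack, 8);
  assert (pred_of (&bigger[2]) == &bigger[1]);
  assert (pred_of (&bigger[1]) == &bigger[0]);
  assert (pred_of (&bigger[0]) == nullptr);
}
```

A copy operation that recomputed the offsets from the old and new addresses would break these links, hence the memcpy.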

The result works, but is messy.  Its verbosity comes from at least two
factors:

- we don't have support for complete symbols (binding kind, value and
  location), and we should at least try to have it.  That simplified
  lalr1.cc a lot.

- I have not tried to be smart and use 'move' when possible.  As a
  consequence many places have 'copy' and then 'destroy'.  That kind
  of clean up can be done once everything appears to be solid.

* data/skeletons/glr2.cc: Be more rigorous in object lifetime.
In particular, don't forget to discard the lookahead when we're done
with it.
Call variant routines where needed.
Deal with plenty of details.
(b4_call_merger): Add support for variants.
Use references in mergers, rather than pointers.

* examples/c++/glr/c++-types.yy: Exercise variants.
2021-01-05 09:28:20 +01:00
Akim Demaille
c2a06bf791 glr: strengthen the tests
On some experimentation I was running, the test suite was passing, yet
the example crashed when run in verbose mode.  Let's add this case to
the test suite.

* tests/cxx-type.at: Run all these tests in verbose mode too.
2021-01-05 07:23:44 +01:00
Akim Demaille
8733959954 c++: I'm tired of Flex's warnings
* doc/bison.texi: Disable another warning I'm tired of seeing.
New releases would be most welcome.
2021-01-03 20:11:48 +01:00
Akim Demaille
31b8b8f179 glr: example: flush the output
* examples/c/glr/c++-types.y: Flush stdout so that the logs (on
stderr) and the effective output (on stdout) mix correctly.
While at it, be a bit more const-correct.
2021-01-03 19:58:23 +01:00
Akim Demaille
e2199d0fb2 style: YYUSE is private, make it YY_USE
This macro is not exposed to users, make it start with 'YY_'.

* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/glr2.cc, data/skeletons/lalr1.cc,
* src/parse-gram.c, tests/actions.at, tests/c++.at, tests/headers.at,
* tests/local.at (YYUSE): Rename as...
(YY_USE): this.
2021-01-03 19:57:10 +01:00
Akim Demaille
c1884d3002 glr.c: example: use a printer
* examples/c++/glr/c++-types.yy: Here.
2021-01-03 08:14:22 +01:00
Akim Demaille
ab1208e263 glr: consistently use the same wording in traces
* data/skeletons/glr.c, data/skeletons/glr2.cc (yyglrReduce): Traces
refer to "state 42", not to "state #42".
2021-01-03 08:14:22 +01:00
Akim Demaille
74d1e881a7 glr2.cc: also equip semantic_option with self check
* data/skeletons/glr2.cc (semantic_option): Add MAGIC_, magic_ and
check_ members.
Use it.
2021-01-02 13:02:34 +01:00
Akim Demaille
f30067ed51 glr2.cc: log the execution of deferred actions
See "glr.c: log the execution of deferred actions".

* data/skeletons/glr2.cc (yyuserAction): Take yyk as a new argument.
Rename argument yyn as yyrule for clarity.
Log before and after the user action.
Adjust callers to not call YY_REDUCE_PRINT and YY_SYMBOL_PRINT.
2021-01-02 13:02:19 +01:00
Akim Demaille
630448ba6b glr2.cc: minor clean up
* data/skeletons/glr2.cc (YYUNDEFTOK): Now useless.
Formatting/coding style changes.
2021-01-02 08:28:26 +01:00
Akim Demaille
1283dc7243 glr.c: log the execution of deferred actions
Currently deferred reductions are not "verbose" at all: only immediate
reductions are displayed in the YYDEBUG traces.  I don't understand
why.  Besides, it actually seems simpler to install the reduction
traces right around the user action inside yyuserAction rather than
around calls to yyuserAction.

The only trouble is that yyuserAction does not know the stack number
it works on, so we have to pass it.  And pass -1 when we are actually
running on a temporary stack.
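
The idea can be sketched as follows.  This is a simplification under stated assumptions, not the skeleton's yyuserAction: `run_action` and its parameters are hypothetical, and the action is reduced to a callable returning an int.

```cpp
#include <cassert>
#include <functional>
#include <ostream>
#include <sstream>

// Hypothetical simplification: the traces are emitted right around the
// action.  'stack' is the stack number, or -1 when running on a
// temporary stack.
inline int run_action (std::ostream& log, int stack, int rule,
                       const std::function<int ()>& action)
{
  assert (static_cast<bool> (action));
  log << "Reducing stack " << stack << " by rule " << rule << ":\n";
  const int value = action ();
  log << "-> $$ = " << value << '\n';
  return value;
}

inline void run_action_demo ()
{
  std::ostringstream log;
  const int v = run_action (log, -1, 6, [] { return 42; });
  assert (v == 42);
  assert (log.str () == "Reducing stack -1 by rule 6:\n-> $$ = 42\n");
}
```

Because the logging lives inside the function, immediate and deferred reductions get the same traces without any help from the callers.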

The glr example, on "T(x) + y;" as input, adds these logs, which
show when the `<cast>` is built:

     Stack 0 Entering state 26
     Reduced stack 0 by rule 7 (line 108); action deferred.  Now in state 7.
     Stack 0 Entering state 7
     Reading a token
     Next token is token '+' (1.6: )
     Stack 1 Entering state 27
     Reduced stack 1 by rule 13 (line 123); action deferred.  Now in state 12.
     Stack 1 Entering state 12
     Next token is token '+' (1.6: )
     Stack 1 dies.
     Removing dead stacks.
     On stack 0, shifting token '+' (1.6: )
     Stack 0 now in state #14
    +Reducing stack -1 by rule 6 (line 107):
    +   $1 = token identifier (1.3: x)
    +-> $$ = nterm expr (1.3: x)
    +Reducing stack -1 by rule 7 (line 108):
    +   $1 = token typename (1.0: T)
    +   $2 = token '(' (1.2: )
    +   $3 = nterm expr (1.3: x)
    +   $4 = token ')' (1.4: )
    +-> $$ = nterm expr (1.0-3: <cast>(x,T))
     Returning to deterministic operation.

* data/skeletons/glr.c (yyuserAction): Take yyk as a new argument.
Rename argument yyn as yyrule for clarity.
Log before and after the user action.
Adjust callers to not call YY_REDUCE_PRINT and YY_SYMBOL_PRINT.
2021-01-02 07:37:00 +01:00
Akim Demaille
e3d4b42f58 glr.c: reorder routines
The next commit wants to use YY_REDUCE_PRINT above its current
definition.  Move it higher.

* data/skeletons/glr.c (yylhsNonterm, YY_REDUCE_PRINT): Make available
earlier.
2021-01-02 07:37:00 +01:00
Akim Demaille
f67c1ce937 glr.c: example: use the exact same display as in the C++ example
* examples/c/glr/c++-types.y: Add a space after the commas.
* examples/c/glr/c++-types.test: Adjust expectations.
2021-01-02 07:37:00 +01:00
Akim Demaille
5a4e606275 glr.c: example: several improvements
* examples/c/glr/c++-types.y (node_print): New.
Use YY_LOCATION_PRINT instead of duplicating it.
And actually use it in the action instead of badly duplicating it.
(main): Add proper option support.
* examples/c/glr/c++-types.test: Adjust expectations on locations.
* examples/c++/glr/c++-types.yy: Fix bad iteration.
2021-01-02 07:36:35 +01:00
Akim Demaille
83f2eb3737 glr2.cc: the example requires Bison 3.8
This will save us from generating the position.hh file.

* src/parse-gram.y: Claim we are 3.8.
* examples/c++/glr/c++-types.yy: Require 3.8.
2020-12-31 08:21:25 +01:00
Akim Demaille
3911aba39a %merge: associate it to its first definition, not the latest
Currently each time we meet %merge we record this location as the
defining location (and symbol).  Instead, record the first definition.

In the generated code we go from

    yy0->A = merge (*yy0, *yy1);

to

    yy0->S = merge (*yy0, *yy1);

where S was indeed the first symbol, and in the diagnostics we go from

    glr-regr18.y:30.18-24: error: result type clash on merge function 'merge': <type2> != <type1>
       30 | sym2: sym3 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:29.18-24: note: previous declaration
       29 | sym1: sym2 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:31.13-19: error: result type clash on merge function 'merge': <type3> != <type2>
       31 | sym3: %merge<merge> { $$ = 0; } ;
          |             ^~~~~~~
    glr-regr18.y:30.18-24: note: previous declaration
       30 | sym2: sym3 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~

to

    glr-regr18.y:30.18-24: error: result type clash on merge function 'merge': <type2> != <type1>
       30 | sym2: sym3 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:29.18-24: note: previous declaration
       29 | sym1: sym2 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~
    glr-regr18.y:31.13-19: error: result type clash on merge function 'merge': <type3> != <type1>
       31 | sym3: %merge<merge> { $$ = 0; } ;
          |             ^~~~~~~
    glr-regr18.y:29.18-24: note: previous declaration
       29 | sym1: sym2 %merge<merge> { $$ = $1; } ;
          |                  ^~~~~~~

where both duplicates are reported against definition 1, rather than
using definition 1 as a reference when diagnosing about definition 2,
and then 2 as a reference for 3.
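
A sketch of the "keep the first definition" policy, written in C++ for brevity.  `merger_table`, `merger_use` and `record` are hypothetical names and do not mirror the data structures of src/reader.c.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical record of a %merge use.
struct merger_use
{
  std::string type;
  int line;
};

class merger_table
{
public:
  // Record a use of merger NAME with result type TYPE at LINE.  Only the
  // first use is stored.  Return 0 if TYPE agrees with the first
  // definition, otherwise the line of that first definition.
  int record (const std::string& name, const std::string& type, int line)
  {
    assert (0 < line);
    // emplace leaves an existing entry untouched.
    const merger_use& first
      = first_.emplace (name, merger_use{type, line}).first->second;
    return first.type == type ? 0 : first.line;
  }

private:
  std::map<std::string, merger_use> first_;
};

inline void merger_demo ()
{
  merger_table t;
  assert (t.record ("merge", "type1", 29) == 0);
  assert (t.record ("merge", "type2", 30) == 29);
  assert (t.record ("merge", "type3", 31) == 29);
}
```

With the three uses of glr-regr18.y, both clashes point back to line 29, as in the second set of diagnostics above.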

* src/reader.c (record_merge_function_type): Keep the first definition.
* tests/glr-regression.at: Adjust.
2020-12-31 08:07:34 +01:00
Akim Demaille
8bc45673d5 %merge: test support for api.value.type=union
* tests/glr-regression.at: here.
2020-12-31 08:07:34 +01:00
Akim Demaille
fbe5abd23d %merge: fix compatibility with api.value.type=union
Reported by Jot Dot.
https://lists.gnu.org/r/help-bison/2020-12/msg00014.html

* data/skeletons/glr.c, data/skeletons/glr2.cc (b4_call_merger): Use
the symbol's slot, not its type.
* examples/c/glr/c++-types.y: Use explicit per-symbol typing together
with api.value.type=union.
(yylex): Use yytoken_kind_t.
2020-12-31 08:07:25 +01:00
Akim Demaille
c09f2e4c7b %merge: delegate the generation of calls to mergers to m4
Don't generate C code from bison, leave that to the skeletons.

* src/output.c (merger_output): Emit invocations to b4_call_merger.
* data/skeletons/glr.c, data/skeletons/glr2.cc (b4_call_merger): New.
2020-12-31 08:07:11 +01:00
Akim Demaille
ac3d5b76f7 %merge: let mergers record a typing-symbol, rather than a type
Symbols are richer than types, and in M4 it is much simpler (and more
common) to deal with symbols rather than types.  So let's associate
mergers to a symbol rather than a type name.

* src/reader.h (merger_list): Replace the 'type' member by a symbol
member.
* src/reader.c (record_merge_function_type): Take a symbol as
argument, rather than a type name.
* src/output.c (merger_output): Adjust.
2020-12-31 08:07:11 +01:00
Akim Demaille
edfcca8481 %merge: clearer tests on diagnostics
* tests/glr-regression.at: Use caret errors.
2020-12-31 08:07:11 +01:00
Akim Demaille
92e943bd24 glr2.cc: style: quoting changes
* data/skeletons/glr2.cc: Use stricter quoting rules.
2020-12-28 08:20:22 +01:00
Akim Demaille
e40da7b6b3 lalr1.cc: style: quoting changes
* data/skeletons/lalr1.cc: here.
2020-12-28 08:20:16 +01:00
Akim Demaille
70b3c8fb20 glr2.cc: use references to the stack rather than pointers
Now that the lookahead macros (that used yystackp) are out of the way,
there is no reason to continue using a pointer.

* data/skeletons/glr2.cc: Use yystack, a reference, rather than
yystackp, a pointer.
Fix tons of const-correctness issues.
2020-12-27 08:50:10 +01:00
Akim Demaille
e9b7641cca glr2.cc: simplify names
Now that we no longer play dangerous games with macros, we can give
the lookahead's token kind its proper name.  The content of yychar
_is_ raw (as opposed to yytoken), there's no reason to pleonasmicate
it (and thus to neologize).

* data/skeletons/glr2.cc (glr_stack::yyrawchar): Rename as...
(glr_stack::yychar): this.
2020-12-27 08:36:20 +01:00
Akim Demaille
321fac2193 glr2.cc: get rid of the macros wrapping the lookahead
In glr.c, the macros yychar, yylval and yylloc make it possible to deal
with api.pure: sometimes they point to global variables (impure),
sometimes they point to the member variables (pure).

There's no room for globals in glr2.cc.  Besides, they map yychar to
yyrawchar, yylval to yyval, etc. which obfuscates what is actually
going on.

* data/skeletons/glr2.cc (glr_stack::yyval, glr_stack::yyloc): Rename
as...
(glr_stack::yylval, glr_stack::yylloc): these, for clarity.
(yynerrs, yychar, yylval, yylloc, yystackp): Remove these macros.
(b4_yygetToken_call): Remove.
2020-12-27 08:34:49 +01:00
Akim Demaille
2777b73166 glr2.cc: reorganize the skeleton
Restore a more natural order: first define the macros and then use
them.  Currently, some macros are defined after the header is issued
but before the implementation file is.  As a result, the header and
the implementation may end up using different versions of the macros.

* data/skeletons/glr2.cc: Define the macros first, then use them.
* data/skeletons/lalr1.cc: Minor comment and quoting changes.
2020-12-26 18:06:02 +01:00
Akim Demaille
89296e3962 glr2.cc: example: simplify
* examples/c++/glr/c++-types.yy: Formatting changes.
Remove unused support for '@'.
* examples/c/glr/c++-types.y: Ditto.
2020-12-26 17:40:18 +01:00
Akim Demaille
bb97a2a37b glr2.cc: make yyreportTree a member function of semantic_option
* data/skeletons/glr2.cc (yy_accessing_symbol, yylhsNonterm): Define
earlier.
(state_stack::yyreportTree): Move to...
(semantic_option::yyreportTree): here.
Adjust dependencies.
2020-12-26 15:40:36 +01:00
Akim Demaille
8a22b557b9 glr2.cc: pass references to yyreportAmbiguity
* data/skeletons/glr2.cc (yyreportAmbiguity): Use references.
2020-12-26 15:40:33 +01:00
Akim Demaille
92dc8bf23b style: use yyval only, not yysval
* data/skeletons/glr.c, data/skeletons/glr2.cc: Use yyval, as in
the other skeletons.
2020-12-26 14:26:23 +01:00
Akim Demaille
c18dbfcb06 glr2.cc: pass location by const ref to yyglrShift
* data/skeletons/glr2.cc (glr_state.yyglrShift): Take the location by
const&.
Remove useless `inline`.
2020-12-26 14:26:23 +01:00
Akim Demaille
94701b4e5e style: rename semanticVal as value
* data/skeletons/README-D.txt: Remove, now useless and obsolete.
* data/skeletons/glr2.cc, examples/d/calc/calc.y,
* tests/calc.at, tests/d.at, tests/scanner.at (semanticVal): Replace
with...
(value): this.
2020-12-26 14:26:23 +01:00
Akim Demaille
3e6826aff1 glr2.cc: remove dead comments
* data/skeletons/glr2.cc: We no longer wrap glr.c here.
2020-12-26 14:26:23 +01:00
Akim Demaille
9466c734c5 glr2.cc: use YYCDEBUG, not YY_DEBUG_STREAM
* data/skeletons/glr2.cc (YY_DEBUG_STREAM): Rename as...
(YYCDEBUG): this, as in lalr1.cc.
2020-12-26 14:26:22 +01:00
Akim Demaille
b9a533d63e glr2.cc: formatting changes
* data/skeletons/glr2.cc: here.
Remove useless `inline`.
2020-12-26 14:26:22 +01:00
Akim Demaille
d0e44162b5 glr2.cc: don't use YYSTYPE/YYLTYPE at all
* data/skeletons/glr2.cc: Define value_type and location_type where
needed, and use them only.
(yyuserMerge): Make it a member function of the glr_state class.
2020-12-26 11:55:01 +01:00
Akim Demaille
8db99c54f4 tests: don't require YYSTYPE/YYLTYPE to be defined in C++
* tests/glr-regression.at: Use AT_YYSTYPE/AT_YYLTYPE to generate
yy::parser::value_type and yy::parser::location_type in C++.
2020-12-26 11:55:01 +01:00
Akim Demaille
2157ced3dd c++: rename semantic_type as value_type
We always refer to the triplet "kind, value, location".  All of them
are nouns, and we support api.value.type and api.location.type.  In
this regard, "semantic_type" was a poor choice.  Make it "value_type".

The test suite was not updated to use value_type, on purpose, to
enforce backward compatibility.

* data/skeletons/c++.m4, data/skeletons/glr.cc, data/skeletons/glr2.cc,
* data/skeletons/variant.hh, doc/bison.texi: Define value_type rather
than semantic_type.
Add a backward compatibility typedef.
* examples/c++/glr/c++-types.yy: Migrate.
2020-12-26 09:05:45 +01:00
Akim Demaille
59653c8efd doc: more about sanitizers
* README-hacking.md: here.
2020-12-26 08:08:06 +01:00
Akim Demaille
2a07cb0f2d glr2.cc: simplify
* data/skeletons/glr2.cc (glr_state_set::yyremoveDeletes): Use
vector::resize rather than vector::erase.
(glr_state::copyFrom): Merge into...
(glr_state::operator=): here.
Valentin wanted each assignment to be explicit, hence copyFrom rather
than operator=.  But in 0a82316e54
(glr2.cc: example: use objects (not pointers) to represent the AST),
in order to get real objects to be processed correctly, we had to
introduce the assignment operator.  Afterward, we also introduced a
full implementation of the copy-ctor, independent of copyFrom.  As a
result, today the only invocation of copyFrom is from the assignment
operator.  Simplify this.
2020-12-26 07:54:44 +01:00
Akim Demaille
734ce73bf2 glr2.cc: fix warnings about uninitialized locations
With GCC10, the CI shows tons of warnings such as
(327. actions.at:374: testing Initial location: glr2.cc):

    input.cc: In member function 'YYRESULTTAG glr_stack::yyglrReduce(state_set_index, rule_num, bool)':
    input.cc:1357:11: error: '<anonymous>.glr_state::yyloc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     1357 |     yyloc = other.yyloc;
          |     ~~~~~~^~~~~~~~~~~~~

This is because we don't have the constructors for locations.  But we
should have them!  It is only because of glr.cc that ctors were not
enabled by default.  In glr2.cc, they should be.

That fixes all the warnings when Bison's locations are used.  However,
when user-defined locations without constructor are used, we still
have:

    550. calc.at:1409: testing Calculator glr2.cc %locations api.location.type={Span}  ...
    calc.cc: In member function 'YYRESULTTAG glr_stack::yyglrReduce(state_set_index, rule_num, bool)':
    calc.cc:1261:11: error: '<anonymous>.glr_state::yyloc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     1261 |     yyloc = other.yyloc;
          |     ~~~~~~^~~~~~~~~~~~~

To address this case, we need glr_state to explicitly initialize its
yyloc member.
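
A cut-down illustration of the fix, assuming a user-defined location type without constructors.  `Span` is modeled on the name in the test title above; `state` is a hypothetical stand-in for glr_state.

```cpp
#include <cassert>

// Hypothetical user-defined location type without any constructor.
struct Span
{
  int first;
  int last;
};

// Hypothetical cut-down state, not the skeleton's glr_state.
struct state
{
  // 'yyloc ()' value-initializes the member; for a type such as Span,
  // which has no user-provided constructor, that zero-initializes it.
  state () : yyloc () {}
  state (const state& other) : yyloc (other.yyloc) {}
  state& operator= (const state& other)
  {
    yyloc = other.yyloc;
    return *this;
  }

  Span yyloc;
};

inline void state_demo ()
{
  state a;
  assert (a.yyloc.first == 0 && a.yyloc.last == 0);
  state b;
  b = a;   // reads a.yyloc, which is initialized
  assert (b.yyloc.last == 0);
}
```

Without the `yyloc ()` in the default constructor, the assignment would read an indeterminate value, which is what the compiler warns about.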

* data/skeletons/glr2.cc: Use genuine objects, with ctors, for position
and location.
(glr_state): Explicitly initialize yyloc in the constructors.
2020-12-26 07:54:05 +01:00
Akim Demaille
baea8cf9fa glr2.cc: provide glr_state with a genuine copy-constructor
The copy constructor was (lazily) implemented by a call to copyFrom.
Unfortunately copyFrom reads yyresolved from the destination (and
source), and in the case of the copy-ctor this is random garbage,
which UBSAN catches:

    glr-regr2a.cc:1072:10: runtime error: load of value 7, which is not a valid value for type 'bool'

Rather than defining yyresolved before calling copyFrom, let's just
provide a genuine copy-ctor for glr_state.
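
In sketch form, with a hypothetical `node` class standing in for glr_state and only two members kept:

```cpp
#include <cassert>

// Hypothetical cut-down state, not the skeleton's glr_state.
class node
{
public:
  explicit node (int value) : resolved_ (true), value_ (value) {}

  // Genuine copy constructor: every member is initialized from 'other';
  // no member of the object under construction is read.
  node (const node& other)
    : resolved_ (other.resolved_), value_ (other.value_)
  {}

  bool resolved () const { return resolved_; }
  int value () const { return value_; }

private:
  bool resolved_;
  int value_;
};

inline void node_demo ()
{
  const node a (7);
  const node b (a);
  assert (b.resolved ());
  assert (b.value () == 7);
}
```

The member initializer list only reads from `other`, so there is nothing uninitialized for UBSAN to flag.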

* data/skeletons/glr2.cc (glr_state::glr_state): Implement properly.
2020-12-25 11:04:53 +01:00
Akim Demaille
636c9a8042 glr2.cc: beware of self-assignment
In yycompressStack:

    while (yyr != YY_NULLPTR)
      {
        nextFreeItem->check_ ();
        yyr->check_();
        nextFreeItem->setState(*yyr);
        glr_state& nextFreeState = nextFreeItem->getState();
        yyr = yyr->pred();
        nextFreeState.setPred(&(nextFreeItem - 1)->getState());
        setFirstTop(&nextFreeState);
        ++nextFreeItem;
      }

it is possible that nextFreeItem and yyr are actually the same state.
In which case `nextFreeItem->setState(*yyr)` does really bad things.
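
The hazard and its guard, reduced to a sketch.  `box` and `set` are hypothetical names, and the real setState deals with a state object rather than a heap-allocated int.

```cpp
#include <cassert>

// Hypothetical object owning a heap value; not the skeleton's
// glr_stack_item.
class box
{
public:
  explicit box (int v) : value_ (new int (v)) {}
  ~box () { delete value_; }
  box (const box&) = delete;
  box& operator= (const box&) = delete;

  void set (const box& other)
  {
    // Without this test, 'set (*this)' would delete the value and then
    // read it back.
    if (this == &other)
      return;
    delete value_;
    value_ = new int (*other.value_);
  }

  int value () const { return *value_; }

private:
  int* value_;
};

inline void box_demo ()
{
  box a (7);
  a.set (a);   // self-assignment is a no-op
  assert (a.value () == 7);
  box b (3);
  a.set (b);
  assert (a.value () == 3);
}
```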

* data/skeletons/glr2.cc (glr_stack_item::setState): Beware of
self-assignment.
2020-12-25 11:04:53 +01:00
Akim Demaille
7dc942dca3 glr: comment changes
* data/skeletons/glr.c: A bit more doc.
(yypstates): Rename yyst (only occurrence) to yys (commonly used for
yyGLRState).
* data/skeletons/glr2.cc: Ditto.
Prefer '\n' to "\n".
2020-12-25 11:04:33 +01:00
Adela Vais
32bb53870b d: remove unnecessary methods from the Lexer interface
The complete symbol approach in yylex removes the need for the methods
semanticVal, startPos and endPos, which were used when the values were
reported separately.

* data/skeletons/lalr1.d: Here.
* doc/bison.texi: Remove sections about the three methods.
* examples/d/calc/calc.y, examples/d/simple/calc.y: Remove the unused methods.
* tests/calc.at, tests/d.at, tests/scanner.at: Test it.
2020-12-21 15:53:32 +01:00
Adela Vais
27109d9d4a d: use Location and Position aliases in the backend
* data/skeletons/lalr1.d: Here.
2020-12-21 15:53:27 +01:00
Adela Vais
2b4451c4af d: remove unnecessary comparison from YYParser.parse()
* data/skeletons/lalr1.d: Here.
2020-12-21 15:51:57 +01:00
Akim Demaille
c0f3b55b25 style: address syntax-check diagnostics
* examples/c/glr/c++-types.y: Formatting changes.
* po/POTFILES.in: Add missing files.
* src/reader.c: Remove useless include.
* tests/calc.at: Avoid magic values for exit.
Obfuscate calls to error.
2020-12-21 07:51:02 +01:00
Adela Vais
20d657c1dd d: create alias Position for YYPosition
* data/skeletons/d.m4 (b4_public_types_declare): Here.
* data/skeletons/lalr1.d: Adjust.
* doc/bison.texi: Document it.
* examples/d/calc/calc.y: Use it.
* tests/calc.at: Test it.
2020-12-21 07:16:25 +01:00