Commit Graph

1284 Commits

Author SHA1 Message Date
Akim Demaille
b7ddb1f224 yacc.c: avoid negated if
* data/skeletons/yacc.c: Prefer a "direct" conditional.
2019-01-05 15:09:28 +01:00
Akim Demaille
2471733f1a package: bump copyrights to 2019
2019-01-05 14:58:05 +01:00
Akim Demaille
c0c45cfa38 java/d: rename some %define variables for consistency
See 890ee8a1fd and
https://lists.gnu.org/archive/html/bison-patches/2019-01/msg00024.html.

* data/skeletons/d.m4, data/skeletons/java.m4
(abstract, annotations, extends, final, implements, public, strictfp):
Rename as...
(api.parser.abstract, api.parser.annotations, api.parser.extends)
(api.parser.final, api.parser.implements, api.parser.public)
(api.parser.strictfp):
these.

* src/muscle-tab.c (muscle_percent_variable_update): Ensure backward
compatibility.

* doc/bison.texi, examples/d/calc.y, examples/java/Calc.y,
tests/input.at: Adjust.
2019-01-05 12:28:55 +01:00
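A minimal sketch of the backward-compatibility idea in c0c45cfa38 (together with the parser_class_name rename from 890ee8a1fd): old %define variable names are mapped to their api.parser.* replacements, and any other name passes through unchanged. The function name `update_variable_name` and the table layout are assumptions for illustration; this is not the code of muscle_percent_variable_update.

```cpp
#include <map>
#include <string>

// Hypothetical helper, NOT Bison's implementation: translate a deprecated
// %define variable name into its current spelling.
std::string update_variable_name(const std::string& name)
{
  // Old name -> new name, as listed in the commit messages.
  static const std::map<std::string, std::string> renames = {
    {"abstract",          "api.parser.abstract"},
    {"annotations",       "api.parser.annotations"},
    {"extends",           "api.parser.extends"},
    {"final",             "api.parser.final"},
    {"implements",        "api.parser.implements"},
    {"public",            "api.parser.public"},
    {"strictfp",          "api.parser.strictfp"},
    {"parser_class_name", "api.parser.class"},
  };
  const auto it = renames.find(name);
  // Names that were never renamed are returned as they are.
  return it == renames.end() ? name : it->second;
}
```

With such a table, a grammar that still uses the old spelling keeps working while the documentation promotes the new one.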
Akim Demaille
230d6c5160 java/d: remove useless macros
There are many macros that are defined and used just
once (b4_public_if, b4_abstract_if, etc.).  That's overkill.  Rather,
let's define a macro to build the "public class YYParser" line.

It appears that the same syntax with "extends", "abstract", etc. is
implemented in the D parser, which looks very fishy...

* data/skeletons/d.m4, data/skeletons/java.m4 (b4_public_if)
(b4_abstract_if, b4_final_if, b4_strictfp_if): Replace with
(b4_parser_class_declaration): this.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: Adjust.
2019-01-05 12:28:28 +01:00
Akim Demaille
84276bc3d5 glr.cc: fix the handling of syntax_error from the scanner
Commit 90a8537e62 was right, but issued
two error messages.  Commit 80ef7e7639
tried to address that by mapping yychar and yytoken to empty, but that
completely breaks the invariants of glr.c.  In particular, yygetToken
can be called repeatedly and is expected to return the latest result,
unless yytoken is YYEMPTY.  Since the previous attempt was "recording"
that the token was coming from an exception by setting it to YYEMPTY,
instead of getting again the faulty token, we fetched another one.

Rather, revert to the first approach: map yytoken to "invalid token",
but record in yychar the fact that we come from an exception thrown in
the scanner.

* data/skeletons/glr.c (YYFAULTYTOK): New.
(yygetToken): Use it to record syntax errors from the scanner.
* tests/c++.at (Syntax error as exception): In addition to checking
syntax_error with error recovery, make sure it also behaves as
expected without.
2019-01-05 10:15:33 +01:00
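The invariant described in 84276bc3d5 can be sketched as a small lookahead cache: the scanner is called only when no lookahead is stored, and a scanner exception is recorded with a dedicated marker instead of "empty". All names and values below (`lookahead`, `kEmpty`, `kFaulty`, `kInvalidToken`) are assumptions that only mirror the roles of yychar, yytoken, YYEMPTY and YYFAULTYTOK; the real skeleton code differs.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical constants: they mirror roles, not the real values.
const int kEmpty = -2;         // no lookahead has been read yet
const int kFaulty = -1;        // the scanner threw while reading the lookahead
const int kInvalidToken = 1;   // symbol handed to the parser for a bad token

struct lookahead
{
  int raw = kEmpty;            // plays the role of yychar
  int token = kInvalidToken;   // plays the role of yytoken

  // Return the current lookahead; call the scanner only if none is cached.
  int get(const std::function<int()>& scan)
  {
    if (raw == kEmpty)
      {
        try
          {
            raw = scan();
            token = raw;       // identity "translation", for brevity
          }
        catch (const std::runtime_error&)
          {
            // Do not reset raw to kEmpty here: the next call would then
            // fetch a fresh token instead of returning the faulty one.
            raw = kFaulty;
            token = kInvalidToken;
          }
      }
    return token;
  }
};
```

Calling `get` twice after a throwing scanner returns the invalid token both times and invokes the scanner once, which is the behavior the commit message asks for.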
Akim Demaille
890ee8a1fd rename parser_class_name as api.parser.class
The previous name was historical and inconsistent.

* src/muscle-tab.c (define_directive): Use the proper value passing
syntax, based on the muscle kind.
(muscle_percent_variable_update): Use the right value passing syntax.
Migrate from parser_class_name to api.parser.class.

* data/skeletons: Migrate from parser_class_name to api.parser.class.

* doc/bison.texi (%define Summary): Document both parser_class_name
and api.parser.class.
Promote the latter over the former.
2019-01-02 19:14:32 +01:00
Akim Demaille
6d9818b0cf style: glr.c: prefer returning a value rather than passing pointers
This is very debatable.  This function is not pure at all, so it could
stick to returning void: that's a common coding style to tell the
difference between "real" (pure) functions and side-effecting
subroutines.  However, we already have this style elsewhere (e.g.,
yylex), and I feel the callers are somewhat nice to read this way.

* data/skeletons/glr.c (yygetLRActions): Return the action rather than
passing by pointer.
While at it, fix type of yytoken.
Adjust callers.
2019-01-02 12:08:04 +01:00
Akim Demaille
80ef7e7639 glr.cc: don't issue two error messages when syntax_error is thrown
Reported by Askar Safin.
https://lists.gnu.org/archive/html/bison-patches/2019-01/msg00000.html

* data/skeletons/glr.c (yygetToken): Return YYEMPTY when an exception
is thrown.
* data/skeletons/lalr1.cc: Log when an exception is caught.
* tests/c++.at (Syntax error as exception): Be sure to recover from
error before triggering another error.
2019-01-02 12:08:04 +01:00
Akim Demaille
5be47a73e8 skeletons: shorten b4_parser_class_name to b4_parser_class
* skeletons/c++.m4, skeletons/d.m4, skeletons/glr.c, skeletons/glr.cc,
* skeletons/java.m4, skeletons/lalr1.cc, skeletons/lalr1.d,
* skeletons/lalr1.java: Here.
2019-01-02 08:02:23 +01:00
Akim Demaille
0dfad676e3 glr.cc: remove duplicate definition of YYLLOC_DEFAULT
It's already provided by glr.c.

* data/skeletons/glr.cc (b4_post_prologue): Here.
2019-01-02 08:02:23 +01:00
Akim Demaille
d07564af63 style: remove stray empty lines
* data/skeletons/glr.c, data/skeletons/glr.cc: here.
* data/skeletons/bison.m4 (b4_glr_cc_if): Move it here.
2019-01-02 08:01:48 +01:00
Akim Demaille
90a8537e62 glr.cc: support syntax_error exceptions
Kindly requested by Аскар Сафин (Askar Safin).
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00033.html

* data/skeletons/glr.c (b4_glr_cc_if): New.
Use it.
(yygetToken): Catch syntax_errors.
* data/skeletons/glr.cc (YY_EXCEPTIONS): New.
* tests/c++.at: Check it.
2018-12-31 07:48:09 +01:00
Akim Demaille
6653c912da glr.c: factor the calls to yylex
The call protocol of yylex is quite complex, and repeated three
times.  Let's factor it.

* data/skeletons/glr.c (yygetToken): New.
Use it.
2018-12-31 07:31:27 +01:00
Akim Demaille
5bcd4292bb style: reduce scopes in glr.c
* data/skeletons/glr.c (yyrecoverSyntaxError): here.
2018-12-31 07:29:50 +01:00
Akim Demaille
7ff7ef678c c++: inline the implementation of syntax_error in its definition
This way, it is easier to make sure its implementation is available in
glr.cc too, which is not the case currently.

* data/skeletons/c++.m4 (b4_public_types_define): Move the
implementation of syntax_error...
(b4_public_types_declare): here.
2018-12-30 10:16:03 +01:00
Akim Demaille
0dc44adbf6 parsers: fix minor stylistic issues
* data/skeletons/variant.hh (b4_token_constructor_declare): Remove,
unused since the previous commit.
Fix indentation issues.
* data/skeletons/c++.m4: Fix indentation issues.
2018-12-27 18:23:49 +01:00
Akim Demaille
5fb0d276b3 c++: variants: fuse declarations and definitions
We used to create a short definition of yy::parser with all the
implementations of its member functions outside.  But yy::parser is no
longer short and simple to read.  Maintaining each function twice is
painful: a lot of redundancy but different indentation levels, output
which depends on whether we are in a header or not (see
d132c2d545), etc.

Let's simplify this and put the implementations into the class
definition itself.

Discussed in this monologue:
https://lists.gnu.org/archive/html/bison-patches/2018-12/msg00058.html.

* data/skeletons/c++.m4, data/skeletons/lalr1.cc,
* data/skeletons/variant.hh (b4_basic_symbol_constructor_define)
(_b4_token_constructor_declare, b4_token_constructor_declare)
Merge into...
(b4_basic_symbol_constructor_define, _b4_token_constructor_define)
(b4_token_constructor_define): these.
2018-12-26 09:12:25 +01:00
Akim Demaille
f44fcd30ea c++: move stack<T> inside yy::parser
We used to define such auxiliary structures outside the class, mainly
as a matter of style to keep the definition of yy::parser short and
simple.  However, now there's a lot more code generated inside the
class definition (e.g., all the token constructors), so the
readability argument no longer applies.

However, if we move stack (and slice) inside yy::parser, then it
should no longer be needed to change the namespace to have multiple
parsers: changing the class name should suffice.

One common argument against inner classes is that they cause code bloat.  It
hardly applies here, since typically different parsers will have
different semantic value types, hence different actual stack types.

* data/skeletons/lalr1.cc: Invoke b4_stack_define inside yy::parser.
2018-12-26 08:24:38 +01:00
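To illustrate the point made in f44fcd30ea, here is a sketch of two parser classes living in the same namespace, each with its own nested stack. The class names `int_parser` and `string_parser` and the stack interface are made up for the example; the generated code is considerably richer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace yy
{
  // First parser: the stack is a member, so it is yy::int_parser::stack<T>.
  class int_parser
  {
  public:
    template <typename T>
    class stack
    {
    public:
      void push(const T& v) { seq_.push_back(v); }
      const T& top() const { return seq_.back(); }
      std::size_t size() const { return seq_.size(); }
    private:
      std::vector<T> seq_;
    };

    typedef int semantic_type;
    stack<semantic_type> values;
  };

  // Second parser in the same namespace: its nested stack does not clash
  // with the first one, because only the class name differs.
  class string_parser
  {
  public:
    template <typename T>
    class stack
    {
    public:
      void push(const T& v) { seq_.push_back(v); }
      const T& top() const { return seq_.back(); }
      std::size_t size() const { return seq_.size(); }
    private:
      std::vector<T> seq_;
    };

    typedef std::string semantic_type;
    stack<semantic_type> values;
  };
}
```

Had `stack` been defined directly in namespace `yy`, the second definition would have been a redefinition error, hence the former need for distinct namespaces.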
Akim Demaille
112ccb5ed7 package: move skeletons into data/skeletons
* data/bison.m4, data/c++-skel.m4, data/c++.m4, data/c-like.m4,
* data/c-skel.m4, data/c.m4, data/d-skel.m4, data/d.m4, data/glr.c,
* data/glr.cc, data/java-skel.m4, data/java.m4, data/lalr1.cc,
* data/lalr1.d, data/lalr1.java, data/location.cc, data/stack.hh,
* data/variant.hh, data/yacc.c:
Move to...
* data/skeletons: here.
Use b4_skeletonsdir instead of b4_pkgdatadir.

* data/local.mk, src/output.c: Adjust.
2018-12-25 07:47:51 +01:00
Akim Demaille
0a4ddce822 c++: style: use consistently this/that instead of this/other
* data/lalr1.cc, data/variant.hh: here.
2018-12-24 19:05:00 +01:00
Akim Demaille
10591c8879 c++: also provide a copy constructor for symbol_type
Suggested by Wolfgang Thaller.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00081.html

* data/c++.m4 (basic_symbol, by_type): Instead of providing either a move
or a copy constructor, always provide the copy one.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check it.
2018-12-24 19:03:32 +01:00
Akim Demaille
807bf60cfc c++: fix double free when a symbol_type was moved
Currently the following piece of code crashes (with parse.assert),
because we don't record that s was moved-from, and we invoke its dtor.

    {
      auto s = parser::make_INT (42);
      auto s2 = std::move (s);
    }

Reported by Wolfgang Thaller.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00077.html

* data/c++.m4 (by_type): Provide a move-ctor.
(basic_symbol): Be sure not to read a moved-from value.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check this case.
2018-12-24 18:58:56 +01:00
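The fix in 807bf60cfc boils down to marking the moved-from object as empty so that its destructor leaves the value alone. A minimal sketch under that assumption follows; the `symbol` class, `kEmpty` and the `live_values` counter are hypothetical and only model a type tag plus manually managed storage, not the generated basic_symbol/by_type code.

```cpp
#include <new>
#include <string>
#include <utility>

const int kEmpty = -1;   // hypothetical "holds no value" marker
int live_values = 0;     // constructed-but-not-yet-destroyed values

class symbol
{
public:
  symbol(int t, const std::string& v) : type(t)
  {
    new (buf_) std::string(v);
    ++live_values;
  }

  // Move: steal the value, then record that the source is now empty.
  symbol(symbol&& that) : type(that.type)
  {
    if (type != kEmpty)
      {
        new (buf_) std::string(std::move(that.value()));
        ++live_values;
        that.clear();   // without this, that's destructor would run again
      }
  }

  symbol(const symbol&) = delete;
  symbol& operator=(const symbol&) = delete;

  ~symbol() { clear(); }

  std::string& value() { return *reinterpret_cast<std::string*>(buf_); }

  int type;

private:
  // Destroy the held value, if any, exactly once.
  void clear()
  {
    if (type != kEmpty)
      {
        destroy(value());
        type = kEmpty;
        --live_values;
      }
  }

  template <typename T> static void destroy(T& t) { t.~T(); }

  alignas(std::string) unsigned char buf_[sizeof(std::string)];
};
```

After `symbol s2(std::move(s));` the source `s` reports `kEmpty`, and leaving the scope destroys the string exactly once.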
Akim Demaille
9858165c52 c++: style: use consistently this/that instead of this/other
* data/c++.m4: here.
2018-12-24 18:55:24 +01:00
Akim Demaille
0978148763 c++: style: rename a few macros for clarity
* data/c++.m4, data/lalr1.cc, data/variant.hh:
s/b4_symbol_constructor/b4_token_constructor/g, as this is really what
is being defined.
2018-12-22 18:28:23 +01:00
Akim Demaille
e5780041b9 c++: exhibit a safe symbol_type
Instead of introducing make_symbol (whose name, btw, somewhat
infringes on the user's "name space", if she defines a token named
"symbol"), let's make the construction of symbol_type safer, using
assertions.

For instance with:

    %token ':' <std::string> ID <int> INT;

generate:

    symbol_type (int token, const std::string&);
    symbol_type (int token, const int&);
    symbol_type (int token);

It does mean that now named token constructors (make_ID, make_INT,
etc.) go through a useless assert, but I think we can ignore this: I
assume any decent compiler will inline the symbol_type ctor inside the
make_TOKEN functions, which will show that the assert is trivially
verified, hence I expect no code will be emitted for it.  And anyway,
that's an assert, NDEBUG controls it.

* data/c++.m4 (symbol_type): Turn into a subclass of
basic_symbol<by_type>.
Declare symbol constructors when variants are enabled.
* data/variant.hh (_b4_type_constructor_declare)
(_b4_type_constructor_define): Replace with...
(_b4_symbol_constructor_declare, _b4_symbol_constructor_def): these.
Generate symbol_type constructors.
* doc/bison.texi (Complete Symbols): Document.
* tests/types.at: Check.
2018-12-22 14:55:07 +01:00
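As a sketch of what e5780041b9 describes for `%token ':' <std::string> ID <int> INT;`, the constructors below assert that the token kind matches the type of the value. The token numbers (258, 259), the predicate helpers and the member layout are assumptions; the generated class stores its value in a variant, not in separate fields.

```cpp
#include <cassert>
#include <string>

// Hypothetical token numbers for the sketch.
enum token { COLON = ':', ID = 258, INT = 259 };

// Which tokens carry which value type (helpers invented for the sketch).
inline bool holds_string(int tok) { return tok == ID; }
inline bool holds_int(int tok) { return tok == INT; }
inline bool holds_nothing(int tok) { return tok == COLON; }

struct symbol_type
{
  int type;
  std::string text;   // used by tokens typed <std::string>
  int number;         // used by tokens typed <int>

  symbol_type(int tok, const std::string& v) : type(tok), text(v), number(0)
  {
    assert(holds_string(tok));
  }
  symbol_type(int tok, const int& v) : type(tok), number(v)
  {
    assert(holds_int(tok));
  }
  explicit symbol_type(int tok) : type(tok), number(0)
  {
    assert(holds_nothing(tok));
  }
};

// A named constructor goes through the same assert, which holds trivially.
inline symbol_type make_INT(int v) { return symbol_type(INT, v); }
```

Building with NDEBUG removes the checks, as the commit message notes.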
Akim Demaille
1f4dd2671a c++: provide symbol constructors per type
On

    %token <int> FOO BAR

we currently generate make_FOO(int) and make_BAR(int).  However, in
order to factor their scanners, some users would also like to have
make_symbol(tok, int), where tok is FOO or BAR.  To ensure type
safety, add assertions that do check that value type and token type
match.  Bind this assertion to the parse.assert %define variable.

Suggested by Frank Heckenbach.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00034.html
Should also match expectations from Аскар Сафин.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00023.html

* data/variant.hh: Use b4_token_visible_if where applicable.
(_b4_type_constructor_declare, _b4_type_constructor_define): New.
Use them.
2018-12-22 13:05:28 +01:00
Akim Demaille
34c52d10ac c++: style changes
* data/c++.m4, data/variant.hh: Improve layout of the generated code.
Avoid casts.
(_b4_symbol_constructor_declare, _b4_symbol_constructor_define): Rename
as...
(_b4_token_maker_declare, _b4_token_maker_define): these.
* tests/types.at: Improve pair printing.
2018-12-22 13:05:28 +01:00
Akim Demaille
6e9f9fcafc style: use b4_token_visible_if
And other formatting/comment changes.

* data/variant.hh: Here.
2018-12-19 07:23:40 +01:00
Akim Demaille
98d199ccc8 c++: fix token constructors for types with commas
Bitten by macros, again.
See 680b715518.

* data/variant.hh (_b4_symbol_constructor_declare)
(_b4_symbol_constructor_define): Do not use user types, which can
include commas as in `std::pair<int, int>`, to macros.

* tests/local.at: Adjust the lex related macros to support the
case of token constructors.
* tests/types.at: Also check token constructors on types with commas.
2018-12-19 06:40:28 +01:00
Akim Demaille
25b9eada8c symbols: check the previous commit
* tests/input.at (Symbol declarations): New.
2018-12-16 12:27:28 +01:00
Akim Demaille
d68f05d75c style: s/non-terminal/nonterminal/
I personally prefer 'non terminal', or 'non-terminal', but
'nonterminal' is the common spelling.

* data/glr.c, src/parse-gram.y, src/symtab.c, src/symtab.h,
* tests/input.at, doc/refcard.tex: here.
2018-12-11 06:55:41 +01:00
Akim Demaille
81dbd0d82e C++: support variadic emplace
Suggested by Askar Safin.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00006.html

* data/variant.hh: Implement.
* tests/types.at: Check.
* doc/bison.texi: Document.
2018-12-10 17:50:12 +01:00
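A variadic emplace, as introduced by 81dbd0d82e, forwards its arguments to the constructor of the stored type. The sketch below assumes a fixed 64-byte buffer and a hypothetical `value_slot` class with `as` and `destroy` helpers; it is not the variant class from variant.hh.

```cpp
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical raw storage for one value at a time.
class value_slot
{
public:
  // Construct a T in place from any constructor arguments.
  template <typename T, typename... U>
  T& emplace(U&&... u)
  {
    static_assert(sizeof(T) <= sizeof(buf_), "value does not fit");
    return *new (buf_) T(std::forward<U>(u)...);
  }

  // Access the stored value; the caller must name the right type.
  template <typename T>
  T& as() { return *reinterpret_cast<T*>(buf_); }

  // Destroy the stored value; the caller must name the right type.
  template <typename T>
  void destroy() { as<T>().~T(); }

private:
  alignas(std::max_align_t) unsigned char buf_[64];
};
```

For example, `slot.emplace<std::string>(3, 'x')` builds the string from a count and a character without creating a temporary std::string first.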
Akim Demaille
e1a843cc69 d: fix double definition of YYSemanticType
* data/lalr1.d: When moving to b4_user_union_members, it also defines
b4_tag_seen_flag, so we had two definitions.
2018-12-08 08:05:00 +01:00
Akim Demaille
10e3ccac05 d: fix use of b4_union_members
* data/lalr1.d: Use b4_user_union_members instead.
2018-12-06 06:27:33 +01:00
Akim Demaille
3d5059f431 style: comment changes
* data/variant.hh: here.
2018-12-06 06:27:33 +01:00
Akim Demaille
cfd682f46d d, java: compute static subtractions
* data/d.m4, data/java.m4: Use b4_subtract where appropriate.
2018-12-05 06:02:01 +01:00
Akim Demaille
0ebcae4a54 d: update the skeleton
* data/d.m4, data/lalr1.d: Catch up with Bison.
And actually, also catch up with D.
2018-12-04 20:43:01 +01:00
Akim Demaille
4a42a4f911 d: add skeleton for the D language
Contributed by Oliver Mangold.
https://lists.gnu.org/archive/html/help-bison/2012-01/msg00000.html

* README-D.txt, d-skel.m4, d.m4, lalr1.d: New.
2018-12-04 20:29:28 +01:00
Akim Demaille
999277ddd8 skeletons: start some technical documentation
* data/README: Convert to Markdown.
Start documenting some of the macros used in all our skeletons.
Simplify and fix the documentation of the macros in the skeletons.
2018-12-04 08:36:52 +01:00
Akim Demaille
c44a782a4e backend: revamp the handling of symbol types
Currently it is the front end that passes the symbol types to the
backend.  For instance:

  %token <ival> NUM
  %type <ival> exp1 exp2
  exp1: NUM { $$ = $1; }
  exp2: NUM { $<ival>$ = $<ival>1; }

In both cases, $$ and $1 are passed to the backend as having type
'ival' resulting in code like `val.ival`.  This is troublesome in the
case of api.value.type=union, since in that case the code is this:

  %define api.value.type union
  %token <int> NUM
  %type <int> exp1 exp2
  exp1: NUM { $$ = $1; }
  exp2: NUM { $<int>$ = $<int>1; }

because in this case, since the backend does not know the symbol being
processed, it is forced to generate casts in both cases: `*(int*)(&val)`.
This is unfortunate in the first case (exp1) where there is no reason
at all to use a cast instead of `val.NUM` and `val.exp1`.

So instead delegate the computation of the actual value type to the
backend: pass $<ival>$ as `symbol-number, ival` and $$ as
`symbol-number, NULL`, instead of passing `ival` before.

* src/scan-code.l (handle_action_dollar): Find the symbol the action
is about, not just its type.  Pass both symbol-number, and explicit
type tag ($<tag>n when there is one) to b4_lhs_value and b4_rhs_value.

* data/bison.m4 (b4_symbol_action): adjust to the new signature to
b4_dollar_pushdef.

* data/c-like.m4 (_b4_dollar_dollar, b4_dollar_pushdef): Accept the
symbol-number as new argument.

* data/c.m4 (b4_symbol_value): Accept the symbol-number as new
argument, and use it.
(b4_symbol_value_union): Accept the symbol-number as new
argument, and use it to prefer reading a union member rather than
casting the union.
* data/yacc.c (b4_lhs_value, b4_rhs_value): Accept the new
symbol-number argument.
Adjust uses of b4_dollar_pushdef.
* data/glr.c (b4_lhs_value, b4_rhs_value): Adjust.

* data/lalr1.cc (b4_symbol_value_template, b4_lhs_value): Adjust
to the new symbol-number argument.
* data/variant.hh (b4_symbol_value, b4_symbol_value_template): Accept
the new symbol-number argument.

* data/java.m4 (b4_symbol_value, b4_rhs_data): New.
(b4_rhs_value): Use them.
* data/lalr1.java: Adjust to b4_dollar_pushdef, and use b4_rhs_data.
2018-12-03 18:40:26 +01:00
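The difference c44a782a4e is after can be sketched with a hand-written union standing in for what api.value.type=union would produce for NUM, exp1 and exp2. The function names `reduce_exp1` and `reduce_exp2` are invented; the first shows member access when the symbol is known, the second the cast that is left when only a type tag is available.

```cpp
// Hypothetical stand-in for the generated value type: one member per symbol.
union value_type
{
  int NUM;
  int exp1;
  int exp2;
};

// exp1: NUM { $$ = $1; }
// The symbols are known, so plain union members can be used.
inline void reduce_exp1(value_type& lhs, const value_type& rhs1)
{
  lhs.exp1 = rhs1.NUM;
}

// exp2: NUM { $<int>$ = $<int>1; }
// Only the type tag is given, so the union itself is cast to that type.
inline void reduce_exp2(value_type& lhs, const value_type& rhs1)
{
  *(int*)(&lhs) = *(const int*)(&rhs1);
}
```

Both functions copy the same bits; the first form is simply easier to read and leaves type checking to the compiler.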
Akim Demaille
e40db8976c style: comment and formatting changes
* data/bison.m4, data/c++.m4, data/glr.c, data/java.m4, data/lalr1.cc,
* data/yacc.c, src/scan-code.l:
Fix comments.
Prefer POS to denote the position of a symbol in a rule, since NUM
is also used to denote symbol numbers.
2018-12-03 08:42:26 +01:00
Akim Demaille
e76a934853 c++: don't define variant<S>, directly define semantic_type
Instead of defining yy::variant<S> and then alias
yy::parser::semantic_type to variant<sizeof (union_type)>, directly
define yy::parser::semantic_type.

This model is more appropriate if we want to sit the storage on top of
unions in C++11.

* data/variant.hh (b4_variant_define): Specialize and inline the
definition into...
(b4_value_type_declare): Here.
Define union_type here.
* data/lalr1.cc: Adjust.
2018-12-03 05:40:46 +01:00
Akim Demaille
6ef788f810 C++: use noexcept and constexpr
There are probably more opportunities for them.
So far, I observed no performance improvements.

* data/c++.m4, data/lalr1.cc, data/stack.hh: here.
2018-12-01 12:54:42 +01:00
Akim Demaille
cc050fd321 warning: avoid warnings about unreachable code
Reported by Uxio Prego.
https://lists.gnu.org/archive/html/help-bison/2018-11/msg00031.html

We also need to move the unreachable 'goto' to a reachable place,
otherwise clang complains about the code being unreachable anyway.
See also https://bugs.llvm.org/show_bug.cgi?id=39736.

Interestingly, we don't have to apply that trick to
`#define YYCDEBUG if (false) std::cerr`, clang does not warn when the
code comes from macro expansion.

* configure.ac: Use -Wunreachable-code when supported.
* data/lalr1.cc, data/yacc.c: Pacify clang's warning about `if (0)`
by using a macro.
Another possibility was to move this statement to a reachable place.
* tests/actions.at, tests/c++.at: Avoid generating unreachable code.
2018-11-25 11:22:31 +01:00
Akim Demaille
660811a6c5 yacc.c: avoid generating dead code
We should probably introduce some struct and functions to deal with
stack management, rather than open coding it.  yyparse would be much
nicer to read, and a better model for possible other skeletons.

* data/yacc.c (yyparse::yysetstate): Avoid generating code when
neither yyoverflow nor YYSTACK_RELOCATE is defined.
2018-11-24 13:26:27 +01:00
Akim Demaille
dee62718ae remove ancient lint directives
* data/c++.m4, data/yacc.c: Remove surprising remains of lint
directives.
2018-11-21 08:59:38 +01:00
Akim Demaille
6bc54a934e style: harmonize the labels of yyparse
* data/glr.c, data/lalr1.cc, data/yacc.c: Fix indentation and
other formatting issues.
2018-11-20 20:52:58 +01:00
Akim Demaille
4e510c69b1 c++: using macros around user types breaks when they include comma
We may generate code such as

    basic_symbol (typename Base::kind_type t, YY_RVREF (std::pair<int,int>) v);

which, of course, breaks, because YY_RVREF sees two arguments.  Let's
not play tricks with __VA_ARGS__, I'm unsure about its portability.
Anyway, I plan to change more things in this area.

Reported by Sébastien Villemot.
http://lists.gnu.org/archive/html/bug-bison/2018-11/msg00014.html

* data/variant.hh (b4_basic_symbol_constructor_declare)
(b4_basic_symbol_constructor_define): Don't use macro on user types.
* tests/types.at: Check that we support pairs.
2018-11-20 20:01:50 +01:00
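The problem reported in 4e510c69b1 is easy to reproduce with any one-argument macro. In this sketch `RVREF` is a hypothetical stand-in for YY_RVREF, and the type alias is an extra workaround shown for comparison; the commit itself simply stops passing user types to the macro.

```cpp
#include <utility>

// Hypothetical stand-in for a macro such as YY_RVREF.
#define RVREF(Type) Type&&

// RVREF(std::pair<int, int>) does not compile: the preprocessor splits
// arguments at the top-level comma and sees two of them.

// Approach of the commit: do not run the user type through the macro.
inline int sum(std::pair<int, int>&& v)
{
  return v.first + v.second;
}

// Alternative (an assumption, not what the commit does): hide the comma
// behind an alias so that the macro receives a single token sequence.
typedef std::pair<int, int> int_pair;
inline int product(RVREF(int_pair) v)
{
  return v.first * v.second;
}
```

Parentheses around the type would not help here, since `(std::pair<int, int>)&&` is not a valid parameter type.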
Akim Demaille
8474dbc09e glr.c: fix use of _Noreturn
In C++, [[noreturn]] must not be between "static" and the rest of the
function signature, it must precede it.  C's _Noreturn does not seem
to have such a constraint, so it is compatible with the C++
constraint.  Since we #define _Noreturn as [[noreturn]] in modern C++,
be sure to push the _Noreturn first.

Unfortunately this was not caught by the test suite, because it always
loads config.h first, and config.h contains another definition of
_Noreturn that does not use [[noreturn]], and hides ours.  That's
probably a sign we should avoid always loading config.h.

* data/glr.c (yyFail, yyMemoryExhausted): here.
2018-11-16 17:37:47 +01:00
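The ordering constraint behind 8474dbc09e, shown on a tiny example. The functions `fail` and `fail_throws` are hypothetical and unrelated to yyFail; the sketch only shows where the attribute has to go once _Noreturn expands to [[noreturn]].

```cpp
#include <stdexcept>

// Accepted: the attribute precedes the whole declaration.
[[noreturn]] static void fail(const char* msg)
{
  throw std::runtime_error(msg);
}

// Not accepted as a noreturn declaration, per the commit message above:
//   static [[noreturn]] void fail(const char* msg);

// Small helper to observe that fail never returns normally.
inline bool fail_throws()
{
  try
    {
      fail("memory exhausted");
    }
  catch (const std::runtime_error&)
    {
      return true;
    }
  return false;
}
```

Writing the specifier first, as in `_Noreturn static void f (void);`, is valid in C as well, so one spelling serves both languages.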
Akim Demaille
037eff335b c++: use YY_CPLUSPLUS
* data/c++.m4: here.
2018-11-14 21:25:29 +01:00