Commit Graph

377 Commits

Author SHA1 Message Date
Akim Demaille
d0571c846f java: fix coding style
I don't plan to fix everything in one go.  But this was in the way of
the next commit.

* data/skeletons/lalr1.java: Avoid space before parens.
* tests/java.at: Adjust.
2020-05-02 09:27:16 +02:00
Akim Demaille
d55c9b001a java: add missing i18n requests
* data/skeletons/lalr1.java (reportSyntaxError): Here.
2020-05-01 10:36:05 +02:00
Akim Demaille
611495999f java: style: fix coding style of yyerror/reportSyntaxError
* data/skeletons/lalr1.java: here.
2020-05-01 10:36:05 +02:00
Akim Demaille
01d5f232a9 java: avoid useless work
* data/skeletons/lalr1.java (yySymbolPrint): Avoid the computation of
the argument if useless.
While at it, fix Java coding style.
2020-05-01 10:36:05 +02:00
Akim Demaille
0407acbc59 java: comment changes
* data/skeletons/lalr1.java, examples/java/calc/Calc.y: here.
2020-05-01 10:36:05 +02:00
Akim Demaille
30357ae942 c++: use modern idioms to make classes non-copyable
Reported by Don Macpherson.
https://lists.gnu.org/r/bug-bison/2019-05/msg00015.html
https://github.com/akimd/bison/issues/36

* data/skeletons/lalr1.cc, data/skeletons/stack.hh,
* data/skeletons/variant.hh: Delete the copy-ctor and the copy operator.
2020-05-01 06:52:04 +02:00
Akim Demaille
fb1d76d9a9 yacc.c: avoid the use of a temporary
* data/skeletons/yacc.c: Use YYLLOC_DEFAULT directly with the final
destination.
2020-04-30 08:07:55 +02:00
Akim Demaille
3b05de2d05 yacc.c: install backward compatibility for YYERRCODE
Some people have been using that symbol.  Some even have #defined it
themselves.
https://lists.gnu.org/r/bison-patches/2020-04/msg00138.html

Let's provide backward compatibility, having it point to YYUNDEF, so
that an error message is generated.

* data/skeletons/yacc.c (YYERRCODE): New, at the exact same location
it was defined before.
2020-04-28 08:26:49 +02:00
Akim Demaille
902a235ad3 style: c++: s/type/kind/ where appropriate
These are internal details.  `type_get ()` is still there to ensure
backward compatibility, `kind ()` being the modern way.

* data/skeletons/c++.m4 (by_type, by_type::type): Rename as...
(by_kind, by_kind::kind_): this.
Adjust dependencies.
2020-04-28 08:16:05 +02:00
Akim Demaille
11027558c8 java: clean up the definition of token kinds
From

    public interface Lexer {
      /* Token kinds.  */
      /** Token number, to be returned by the scanner.  */
      static final int YYEOF = 0;
      /** Token number, to be returned by the scanner.  */
      static final int YYERRCODE = 256;
      /** Token number, to be returned by the scanner.  */
      static final int YYUNDEF = 257;
      /** Token number, to be returned by the scanner.  */
      static final int BANG = 258;
    ...
      /** Deprecated, use b4_symbol(0, id) instead.  */
      public static final int EOF = YYEOF;

to

    public interface Lexer {
      /* Token kinds.  */
      /** Token "end of file", to be returned by the scanner.  */
      static final int YYEOF = 0;
      /** Token error, to be returned by the scanner.  */
      static final int YYerror = 256;
      /** Token "invalid token", to be returned by the scanner.  */
      static final int YYUNDEF = 257;
      /** Token "!", to be returned by the scanner.  */
      static final int BANG = 258;
    ...
      /** Deprecated, use YYEOF instead.  */
      public static final int EOF = YYEOF;

* data/skeletons/java.m4 (b4_token_enum): Display the symbol's tag in
comment.
* data/skeletons/lalr1.java: Address overquotation issue.
* examples/java/calc/Calc.y, examples/java/simple/Calc.y: Use YYEOF,
not EOF.
2020-04-28 07:56:00 +02:00
Akim Demaille
cd4e799da4 error: rename the error token from YYERRCODE to YYerror
See https://lists.gnu.org/r/bison-patches/2020-04/msg00162.html.

* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.java, doc/bison.texi,
* examples/c/bistromathic/parse.y, src/scan-gram.l, src/symtab.c
(YYERRCODE): Rename as...
(YYerror): this.
Adjust dependencies.
2020-04-28 07:54:07 +02:00
Akim Demaille
b254b36db8 all: don't emit an error message when the scanner returns YYERRCODE
I'm quite pleased to see that the tricky case of glr.c was already
prepared by the changes to support syntax_error exceptions.  Better
yet, it is actually syntax_error that becomes a special case of the
general pattern: make yytoken be YYERRCODE.

* data/skeletons/glr.c (YYFAULTYTOK): Remove the now useless (Basil)
Faulty token.
Instead, use the error token.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: When computing
the action, first check the case of the error token.

* tests/calc.at: Check cases for the error token symbols before and
after it.
2020-04-26 19:55:52 +02:00
Akim Demaille
58e79539fc c: don't emit an error message when the scanner returns YYERRCODE
* data/skeletons/yacc.c (yyparse): When the scanner returns YYERRCODE,
go directly to error recovery (yyerrlab1).
However, don't keep the error token as lookahead, that token is too
special.
* data/skeletons/lalr1.cc: Likewise.

* examples/c/bistromathic/parse.y (yylex): Use that feature to report
nicely invalid characters.
* examples/c/bistromathic/bistromathic.test: Check that.
* examples/test: Neutralize gratuitous differences such as rule
position.

* tests/calc.at: Check that case in C only.
The other case seem to be working, but that's an illusion that the
next commit will address (in fact, they can enter endless loops, and
report the error several times anyway).
2020-04-26 18:05:30 +02:00
Akim Demaille
7eabe1c70b c++: make valid to print the empty symbol
* data/skeletons/lalr1.cc (yy_print_): here.
2020-04-26 15:09:52 +02:00
Akim Demaille
7fec669e42 c++: always define symbol_name
* data/skeletons/lalr1.cc (symbol_name): Always define it, even when
it's actually yytname which is used.
2020-04-26 15:09:52 +02:00
Akim Demaille
cbbbe12e02 c++: fix a few style issues
* data/skeletons/lalr1.cc (yystack_print_, yy_reduce_print_): Add
missing const.
(yystack_print_): Rename as...
(yy_stack_print_): this.
* data/skeletons/glr.cc (yy_symbol_value_print_, yy_symbol_print_):
Add missing const.
2020-04-26 15:09:52 +02:00
Akim Demaille
286d0755f8 all: prefer YYERRCODE to YYERROR
We will not keep YYERRCODE anyway, it causes backward compatibility
issues.  So as a first step, let all the skeletons use that name,
until we have a better one.

* data/skeletons/bison.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c, doc/bison.texi, tests/headers.at,
* tests/input.at:
here.
2020-04-26 15:09:51 +02:00
Akim Demaille
1c3d79b871 style: glr.c: clarify
* data/skeletons/glr.c: Make the code a bit clearer.
2020-04-26 14:49:26 +02:00
Akim Demaille
9c72d3c5a8 style: prefer b4_has_translations_if
* data/skeletons/glr.c, data/skeletons/yacc.c: here.
2020-04-26 13:34:30 +02:00
Akim Demaille
bc083be055 style: glr.c: fix indentation issue
* data/skeletons/glr.c (yyparse): here.
2020-04-26 10:57:23 +02:00
Akim Demaille
22eeb1ab8a style: fix a few remaining 'type' instead of 'kind'
* data/skeletons/glr.c, data/skeletons/yacc.c (YY_SYMBOL_PRINT):
Here.
2020-04-26 10:57:22 +02:00
Akim Demaille
c4dbc1776c skeletons: make the warning about implementation details clearer
* data/skeletons/bison.m4 (b4_disclaimer): Here.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: Use it.
2020-04-26 10:57:02 +02:00
Akim Demaille
b74fc07d21 style: c: fix a few minor issues about indentation of cpp directives
* README-hacking.md: More about cpp.
* data/skeletons/c.m4, data/skeletons/yacc.c: Style changes.
2020-04-25 12:16:57 +02:00
Akim Demaille
150dc95395 style: clarify #endif
We could try to avoid the weird "#if 1", but then the indentation of
the inner #if would be wrong.  Let' keep it this way.

* data/skeletons/yacc.c: here.
Also, avoid sticking the comment to the directive.
2020-04-25 11:06:16 +02:00
Akim Demaille
bb7c4a5508 style: minor fixes
* data/skeletons/bison.m4, doc/bison.texi: Spell check.
* examples/c/bistromathic/parse.y (N_): Remove, now useless.
2020-04-25 08:00:08 +02:00
Akim Demaille
81334eb5a0 c, c++: provide a default definition for N_
In C/C++, N_ is a no-op.  Define it if the user didn't.
Suggested by Frank Heckenbach.
https://lists.gnu.org/r/bug-bison/2020-04/msg00010.html

* src/output.c (prepare_symbol_names): Rename has_translations as
has_translations_flag.
* data/skeletons/bison.m4 (b4_has_translations_if): New.
* data/skeletons/java.m4 (b4_trans): Use it.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c
(N_): Provide a default definition.
2020-04-20 07:37:45 +02:00
Akim Demaille
9b7e7077dd style: fix comments
* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c: here.
2020-04-19 15:40:12 +02:00
Akim Demaille
d6ae95fb50 c++: give public access to the symbol kind
symbol_type::token () was removed: it returned the token kind of a
symbol.  To do that, one needs to convert from the symbol kind to the
token kind, which requires a table.

This broke some users' unit tests for scanners, see
https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html
https://lists.gnu.org/r/bug-bison/2020-03/msg00020.html
https://lists.gnu.org/r/help-bison/2020-04/msg00005.html

Instead of making this possible again, let's check the symbol's kind
instead.  So give proper access to a symbol's kind.

That feature existed, undocumented, as 'type_get()'.  Let's rename
this as 'kind()'.

* data/skeletons/c++.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc (type_get): Rename as...
(kind): This.
(type_get): Install a backward compatibility alias.
* doc/bison.texi (Complete Symbols): Document symbol_type and
symbol_type::kind.
2020-04-18 08:03:59 +02:00
Akim Demaille
e86b14069d doc: token_kind_type in C++
* data/skeletons/c++.m4: Define the old names in terms on the new
ones, instead of the converse.
* doc/bison.texi (C++ Parser Interface): Be more extensive about
token_kind_type.
2020-04-17 08:53:37 +02:00
Akim Demaille
caadfc552b skeletons: use symbol(-2, kind)
Not all the symbols have a fixed symbol code.  UNDEF's one is fixed:
-2.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c: here.
2020-04-16 07:35:06 +02:00
Akim Demaille
c4c25e091c style: comments changes about error handling
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: here.
* data/skeletons/lalr1.cc: Reduce scope.
2020-04-16 07:34:37 +02:00
Akim Demaille
758172a8b9 doc: spell check
* doc/bison.texi, NEWS, README-hacking.md: here.
And elsewhere.
2020-04-13 18:50:05 +02:00
Akim Demaille
dab08da605 java: promote YYEOF rather that Lexer.EOF
* doc/bison.texi: here.
* data/skeletons/lalr1.java: Use YYEOF.
2020-04-13 17:08:53 +02:00
Akim Demaille
8cedb4b40e java: fix names
* data/skeletons/lalr1.java (yySymbolPrint): There are no pointers
here, remove the `p` suffix.
Use the appropriate type for locations.
2020-04-13 17:04:34 +02:00
Akim Demaille
258c2c967f doc: java: SymbolKind, etc.
Why didn't I think about this before???  symbolName should be a method
of SymbolKind.

* data/skeletons/lalr1.java (YYParser::yysymbolName): Move as...
* data/skeletons/java.m4 (SymbolKind::getName): this.
Make the table a static final table, not a local variable.
Adjust dependencies.
* doc/bison.texi (Java Parser Interface): Document i18n.
(Java Parser Context Interface): Document SymbolKind.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-13 16:54:48 +02:00
Akim Demaille
42ab6c1e44 doc: c++: document parser::context
* doc/bison.texi (C++ Parser Context): New.

* data/skeletons/lalr1.cc (parser::yysymbol_name): Rename as...
(parser::symbol_name): this.
(A Complete C++ Example): Promote LAC, now that we have it.
Promote parse.error detailed over verbose.
* examples/c++/calc++/calc++.test, tests/local.at: Adjust.
2020-04-13 16:54:14 +02:00
Akim Demaille
71e3f6d4da d: put YYEMPTY in the TokenKind
* data/skeletons/d.m4, data/skeletons/lalr1.d (b4_token_enums): Rename
YYTokenType as TokenKind.
Define YYEMPTY.
* examples/d/calc.y, tests/calc.at, tests/scanner.at: Adjust.
2020-04-13 16:49:54 +02:00
Akim Demaille
64aec0a8d8 c, c++: also define YYEMPTY in yytoken_kind_t
I have been hesitating a lot before doing it ---after all the user
must not use this kind, so what's the point of showing it in
yytoken_kind_t.  And eventually I chose to play it safe with the
typing system and make it possible to use yytoken_kind_t for all the
tokens, even the "empty token".

* data/skeletons/c.m4: Give an id and a tag to YYEMPTY.
(b4_token_enums): Define YYEMPTY.
* data/skeletons/c++.m4 (b4_token_enums): Define YYEMPTY.
* data/skeletons/glr.c, data/skeletons/glr.cc, data/skeletons/yacc.c:
(YYEMPTY): Remove.
Use b4_symbol(-2, id) instead.
2020-04-13 16:49:48 +02:00
Akim Demaille
7a226860ef doc: promote yytoken_kind_t, not yytokentype
* data/skeletons/c.m4 (yytoken_kind_t): New.
* data/skeletons/c++.m4, data/skeletons/lalr1.cc (yysymbol_kind_type):
New.
* examples/c/lexcalc/parse.y, examples/c/reccalc/parse.y,
* tests/regression.at:
Use them.
* doc/bison.texi: Replace "enum yytokentype" by "yytoken_kind_t".
(api.token.raw): Explain that it forces "yytoken_kind_t" to coincide
with "yysymbol_kind_t".
(Calling Convention): Mention YYEOF.
(Table of Symbols): Add entries for "yytoken_kind_t" and
"yysymbol_kind_t".
(Glossary): Add entries for "Kind", "Token kind" and "Symbol kind".
2020-04-12 19:24:12 +02:00
Akim Demaille
5839f4d289 c: rename yyexpected_tokens as yypcontext_expected_tokens
The user should think of yypcontext fields as accessible only via
yypcontext_* functions.  So let's rename yyexpected_tokens to reflect
that.

Let's _not_ rename yyreport_syntax_error, as the user may define this
function, and is not allowed to access directly the fields of
yypcontext_t: she *must* use the "accessors".  This is comparable to
the case of C++/Java where the user defines
parser::report_syntax_error, not parser::context::report_syntax_error.

* data/skeletons/glr.c, data/skeletons/yacc.c (yyexpected_tokens):
Rename as...
(yypcontext_expected_tokens): this.
Adjust dependencies.
2020-04-12 19:23:40 +02:00
Akim Demaille
72c9fa4510 skeletons: use "end of file" instead of "$end"
The name "$end" is nice in the report, in particular it avoids that
pointed-rules (aka items) be too long.  It also helps keeping them
"standard".

But it is bad in error messages, we should report "end of file" (or
maybe "end of input", this is debatable).  So, unless the user already
defined the alias for the error token herself, make it "end of file".
It should even be translated if the user already translated some
tokens, so that there is now no strong reason to redefine the $end
token.

* src/output.c (prepare_symbol_names): Issue "end of file" instead of
"$end".

* data/skeletons/lalr1.java (yytnamerr_): Remove the renaming hack.

* build-aux/update-test: Accept files with names containing a "+",
such as c++.at.
* tests/actions.at, tests/c++.at, tests/conflicts.at,
* tests/glr-regression.at, tests/regression.at, tests/skeletons.at:
Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
ecf5cb7e0e c++: remove the yy prefix from some functions
yy::parser features a parse() function, not a yyparse() one.

* data/skeletons/lalr1.cc (yyreport_syntax_error)
(context::yyexpected_tokens): Rename as...
(report_syntax_error, context::expected_tokens): these.
2020-04-12 13:56:44 +02:00
Akim Demaille
e50de09886 tokens: properly define the YYEOF token kind
Currently EOF is handled in an adhoc way, with a #define YYEOF 0 in
the implementation file.  As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.

Give the $end token a visible kind name, YYEOF.  Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.

* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
2020-04-12 13:56:44 +02:00
Akim Demaille
95421df67b tokens: define the "$undefined" token kind
* data/skeletons/bison.m4 (b4_symbol_token_kind): Give a definition to
$undefined.
(b4_token_visible_if): $undefined has an id.
* src/output.c (prepare_symbol_definitions): Stop lying: $undefined
_is_ a token.
* tests/input.at: Adjust.
2020-04-12 13:56:43 +02:00
Akim Demaille
a4ed94bc13 tokens: properly define the "error" token kind
There are people out there that do use YYERRCODE (the token kind of
the error token).  See for instance
3812012bb7/unixODBC-2.3.2/Drivers/nn/yylex.c.

Currently, YYERRCODE is defined by yacc.c in an adhoc way as a #define
in the *.c file only.  It belongs with the other token kinds.

YYERRCODE is not a nice name, it does not fit in our naming scheme.
YYERROR would be more logical, but it collides with the YYERROR macro.
Shall we keep the same name in all the skeletons?  Besides, to avoid
collisions in C, we need to apply the api prefix: YYERRCODE is
actually <PREFIX>ERRCODE.  This is not needed in the other languages.

* data/skeletons/bison.m4 (b4_symbol_token_kind): New.
Map the error token to "YYERRCODE".
* data/skeletons/yacc.c (YYERRCODE): Don't define it, it's handled by...
* src/output.c (prepare_symbol_definitions): this.
* tests/input.at (Redefining the error token): Check it.
2020-04-12 13:56:43 +02:00
Akim Demaille
07726f1178 tokens: style: minor fixes
* data/skeletons/bison.m4 (b4_symbol_kind): Dispatch on the UNDEF
token number rather than its name.
* data/skeletons/c++.m4, data/skeletons/c.m4, data/skeletons/java.m4:
Comment changes.
2020-04-12 13:56:43 +02:00
Akim Demaille
e78596955d glr.cc: remove dead code
* data/skeletons/glr.cc: here.
2020-04-12 13:56:43 +02:00
Akim Demaille
ecd5cae2d4 c++: fix generated headers
A forthcoming commit (tokens: properly define the "error" token kind)
revealed a problem in the C++ generated headers: they are not
self-contained.  With this file:

    %language "c++"
    %define api.value.type variant

    %code {
      static int yylex (yy::parser::semantic_type *lvalp);
    }

    %token <int> X

    %%

    exp:
      X { printf ("x\n"); }
    ;

    %%

    void
    yy::parser::error (const std::string& m)
    {
      std::cerr << m << '\n';
    }

    static
    int yylex (yy::parser::semantic_type *lvalp)
    {
      static int const input[] = {yy::parser::token::X, 0};
      static int toknum = 0;
      return input[toknum++];
    }

    int
    main (int argc, char const* argv[])
    {
      yy::parser p;
      return p.parse ();
    }

the generated header fails to compile cleanly (foo.cc just #includes
the generated header):

    $ clang++-mp-9.0 -c -Wundefined-func-template foo.cc
    In file included from foo.cc:1:
    bar.tab.hh:550:12: warning: instantiation of function 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' required here, but no definition is available
          [-Wundefined-func-template]
        struct symbol_type : basic_symbol<by_type>
               ^
    bar.tab.hh:436:7: note: forward declaration of template entity is here
          basic_symbol (basic_symbol&& that);
          ^
    bar.tab.hh:550:12: note: add an explicit instantiation declaration to suppress this warning if 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' is explicitly instantiated
          in another translation unit
        struct symbol_type : basic_symbol<by_type>
               ^
    1 warning generated.

* data/skeletons/c++.m4 (b4_public_types_define): Move the
implementation of the basic_symbol move-ctor to...
(b4_public_types_define): here, its declaration.
* tests/headers.at (Sane headers): Use a declared token so that the
corresponding token constructor is declared.  Which triggers the
aforementioned issue.
2020-04-12 13:56:21 +02:00
Akim Demaille
8dcc25a1e4 style: rename YYNOMEM as YYENOMEM
This is clearer.

* data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): Rename as...
(YYENOMEM): here.
2020-04-10 18:35:29 +02:00
Akim Demaille
00a654c8ad c++: improvements on symbol kinds
Instead of

    /// (Internal) symbol kind.
    enum symbol_kind_type
    {
      YYNTOKENS = 5, ///< Number of tokens.
      YYSYMBOL_YYEMPTY = -2,
      YYSYMBOL_YYEOF = 0,                      // END_OF_FILE
      YYSYMBOL_YYERROR = 1,                    // error
      YYSYMBOL_YYUNDEF = 2,                    // $undefined
      YYSYMBOL_TEXT = 3,                       // TEXT
      YYSYMBOL_NUMBER = 4,                     // NUMBER
      YYSYMBOL_YYACCEPT = 5,                   // $accept
      YYSYMBOL_result = 6,                     // result
      YYSYMBOL_list = 7,                       // list
      YYSYMBOL_item = 8                        // item
    };

generate

    /// Symbol kinds.
    struct symbol_kind
    {
      enum symbol_kind_type
      {
        YYNTOKENS = 5, ///< Number of tokens.
        S_YYEMPTY = -2,
        S_YYEOF = 0,                             // END_OF_FILE
        S_YYERROR = 1,                           // error
        S_YYUNDEF = 2,                           // $undefined
        S_TEXT = 3,                              // TEXT
        S_NUMBER = 4,                            // NUMBER
        S_YYACCEPT = 5,                          // $accept
        S_result = 6,                            // result
        S_list = 7,                              // list
        S_item = 8                               // item
      };
    };

* data/skeletons/c++.m4 (api.symbol.prefix): Define to S_.
Adjust all the uses.
(b4_public_types_declare): Nest the enum inside 'struct symbol_kind'.
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* tests/headers.at, tests/local.at: Adjust.
2020-04-10 18:35:29 +02:00