Commit Graph

7055 Commits

Author SHA1 Message Date
Akim Demaille
e86b14069d doc: token_kind_type in C++
* data/skeletons/c++.m4: Define the old names in terms of the new
ones, instead of the converse.
* doc/bison.texi (C++ Parser Interface): Be more extensive about
token_kind_type.
2020-04-17 08:53:37 +02:00
Akim Demaille
5d983253f7 doc: updates for 3.6
* doc/bison.texi: More s/token type/token kind/.
* NEWS: Update.
2020-04-16 08:44:36 +02:00
Akim Demaille
caadfc552b skeletons: use symbol(-2, kind)
Not all the symbols have a fixed symbol code.  UNDEF's one is fixed:
-2.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c: here.
2020-04-16 07:35:06 +02:00
Akim Demaille
c4c25e091c style: comments changes about error handling
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: here.
* data/skeletons/lalr1.cc: Reduce scope.
2020-04-16 07:34:37 +02:00
Akim Demaille
fc18b4313a examples: bistro: don't be lazy with switch
* examples/c/bistromathic/parse.y (yylex): Use the switch to
discriminate all the cases.
2020-04-14 08:20:05 +02:00
Akim Demaille
758172a8b9 doc: spell check
* doc/bison.texi, NEWS, README-hacking.md: here.
And elsewhere.
2020-04-13 18:50:05 +02:00
Akim Demaille
e9454c3456 gnulib: update
2020-04-13 17:47:20 +02:00
Akim Demaille
8f01cf0269 doc: more about the coding style
* README-hacking.md: here.
(Troubleshooting): New.
2020-04-13 17:08:53 +02:00
Akim Demaille
dab08da605 java: promote YYEOF rather than Lexer.EOF
* doc/bison.texi: here.
* data/skeletons/lalr1.java: Use YYEOF.
2020-04-13 17:08:53 +02:00
Akim Demaille
8cedb4b40e java: fix names
* data/skeletons/lalr1.java (yySymbolPrint): There are no pointers
here, remove the `p` suffix.
Use the appropriate type for locations.
2020-04-13 17:04:34 +02:00
Akim Demaille
258c2c967f doc: java: SymbolKind, etc.
Why didn't I think about this before???  symbolName should be a method
of SymbolKind.

* data/skeletons/lalr1.java (YYParser::yysymbolName): Move as...
* data/skeletons/java.m4 (SymbolKind::getName): this.
Make the table a static final table, not a local variable.
Adjust dependencies.
* doc/bison.texi (Java Parser Interface): Document i18n.
(Java Parser Context Interface): Document SymbolKind.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-13 16:54:48 +02:00
Akim Demaille
9a33570493 style: java: get closer to the Java style
* examples/java/calc/Calc.y, examples/java/simple/Calc.y: here.
2020-04-13 16:54:14 +02:00
Akim Demaille
42ab6c1e44 doc: c++: document parser::context
* doc/bison.texi (C++ Parser Context): New.

* data/skeletons/lalr1.cc (parser::yysymbol_name): Rename as...
(parser::symbol_name): this.
(A Complete C++ Example): Promote LAC, now that we have it.
Promote parse.error detailed over verbose.
* examples/c++/calc++/calc++.test, tests/local.at: Adjust.
2020-04-13 16:54:14 +02:00
Akim Demaille
dc1035bada doc: promote YYEOF
* NEWS (Deep overhaul of the symbol and token kinds): New.
* doc/bison.texi: Promote YYEOF over "0" in scanners.
(Token Decl): No longer show YYEOF here, it now works by default.
(Token I18n): More details about YYEOF here.
(Calc++): Just use YYEOF.
2020-04-13 16:54:14 +02:00
Akim Demaille
71e3f6d4da d: put YYEMPTY in the TokenKind
* data/skeletons/d.m4, data/skeletons/lalr1.d (b4_token_enums): Rename
YYTokenType as TokenKind.
Define YYEMPTY.
* examples/d/calc.y, tests/calc.at, tests/scanner.at: Adjust.
2020-04-13 16:49:54 +02:00
Akim Demaille
3877b7210e regen
2020-04-13 16:49:54 +02:00
Akim Demaille
64aec0a8d8 c, c++: also define YYEMPTY in yytoken_kind_t
I have been hesitating a lot before doing it ---after all the user
must not use this kind, so what's the point of showing it in
yytoken_kind_t.  And eventually I chose to play it safe with the
typing system and make it possible to use yytoken_kind_t for all the
tokens, even the "empty token".

* data/skeletons/c.m4: Give an id and a tag to YYEMPTY.
(b4_token_enums): Define YYEMPTY.
* data/skeletons/c++.m4 (b4_token_enums): Define YYEMPTY.
* data/skeletons/glr.c, data/skeletons/glr.cc, data/skeletons/yacc.c:
(YYEMPTY): Remove.
Use b4_symbol(-2, id) instead.
2020-04-13 16:49:48 +02:00
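The typing argument in the commit above can be illustrated with a minimal sketch. It is not Bison's generated code: the names `token_kind_t`, `parser_state`, `peek`, `consume`, and `demo_lex` are hypothetical, and only the value -2 for the empty kind comes from the commit message. Once the "empty token" is a member of the enum, a lookahead slot can be declared with the enum type instead of plain `int`.

```c
#include <assert.h>

/* Hypothetical token kinds.  Only the value -2 for the empty kind is
   taken from the commit message; the other values are made up.  */
typedef enum { TOK_EMPTY = -2, TOK_EOF = 0, TOK_NUM = 258 } token_kind_t;

/* Hypothetical parser state: the lookahead has the enum type even
   when no token has been read yet.  */
typedef struct { token_kind_t lookahead; } parser_state;

void state_init(parser_state *s) {
  assert(s);
  s->lookahead = TOK_EMPTY;
}

/* Call the scanner only when no lookahead is pending.  */
token_kind_t peek(parser_state *s, token_kind_t (*lex)(void)) {
  assert(s && lex);
  if (s->lookahead == TOK_EMPTY)
    s->lookahead = lex();
  return s->lookahead;
}

/* Discard the pending lookahead.  */
void consume(parser_state *s) {
  assert(s);
  s->lookahead = TOK_EMPTY;
}

/* Demo scanner that counts its calls and always returns TOK_NUM.  */
int demo_lex_calls = 0;
token_kind_t demo_lex(void) {
  ++demo_lex_calls;
  return TOK_NUM;
}
```

Calling `peek` twice in a row invokes the scanner once; the second call sees a non-empty lookahead and returns it unchanged.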
Akim Demaille
5e2e9af56d doc: use "code", not "number", for token (and symbol) kinds
"Number" is too much about arithmetic.  "Code" conveys better the
"enum" nature of token kinds.  And of symbol kinds.

* doc/bison.texi: Here.
2020-04-12 19:24:44 +02:00
Akim Demaille
7a226860ef doc: promote yytoken_kind_t, not yytokentype
* data/skeletons/c.m4 (yytoken_kind_t): New.
* data/skeletons/c++.m4, data/skeletons/lalr1.cc (yysymbol_kind_type):
New.
* examples/c/lexcalc/parse.y, examples/c/reccalc/parse.y,
* tests/regression.at:
Use them.
* doc/bison.texi: Replace "enum yytokentype" by "yytoken_kind_t".
(api.token.raw): Explain that it forces "yytoken_kind_t" to coincide
with "yysymbol_kind_t".
(Calling Convention): Mention YYEOF.
(Table of Symbols): Add entries for "yytoken_kind_t" and
"yysymbol_kind_t".
(Glossary): Add entries for "Kind", "Token kind" and "Symbol kind".
2020-04-12 19:24:12 +02:00
Akim Demaille
c973361138 doc: document yypcontext_t, and api.symbol.prefix
* doc/bison.texi (%define Summary): Document api.symbol.prefix.
(Syntax Error Reporting Function): Document yypcontext_t,
yypcontext_location, yypcontext_token, yypcontext_expected_tokens, and
yysymbol_kind_t.
2020-04-12 19:24:02 +02:00
Akim Demaille
5839f4d289 c: rename yyexpected_tokens as yypcontext_expected_tokens
The user should think of yypcontext fields as accessible only via
yypcontext_* functions.  So let's rename yyexpected_tokens to reflect
that.

Let's _not_ rename yyreport_syntax_error, as the user may define this
function, and is not allowed to access directly the fields of
yypcontext_t: she *must* use the "accessors".  This is comparable to
the case of C++/Java where the user defines
parser::report_syntax_error, not parser::context::report_syntax_error.

* data/skeletons/glr.c, data/skeletons/yacc.c (yyexpected_tokens):
Rename as...
(yypcontext_expected_tokens): this.
Adjust dependencies.
2020-04-12 19:23:40 +02:00
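The convention described in the commit above (a user-defined reporting function that reaches the context only through accessors) might look roughly like the following stand-alone sketch. Everything here is a hypothetical stand-in: `pctx_t`, `pctx_token`, `pctx_expected_tokens`, `symbol_name`, and the symbol kinds mimic the shape of the interface but are not the generated code, and the reporter writes into a buffer instead of printing so that it can be checked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for generated symbol kinds.  */
typedef enum { SYM_EMPTY = -2, SYM_EOF = 0, SYM_NUM = 3, SYM_PLUS = 4 } symbol_kind_t;

/* Hypothetical context.  User code is expected to treat it as opaque.  */
typedef struct {
  symbol_kind_t unexpected;
  const symbol_kind_t *expected;
  int nexpected;
} pctx_t;

const char *symbol_name(symbol_kind_t k) {
  switch (k) {
  case SYM_EOF:  return "end of file";
  case SYM_NUM:  return "number";
  case SYM_PLUS: return "+";
  default:       return "invalid token";
  }
}

/* Accessor: the unexpected token.  */
symbol_kind_t pctx_token(const pctx_t *ctx) {
  assert(ctx);
  return ctx->unexpected;
}

/* Accessor: store at most argc expected kinds in argv and return how
   many were stored.  */
int pctx_expected_tokens(const pctx_t *ctx, symbol_kind_t argv[], int argc) {
  assert(ctx && argv && argc >= 0);
  int n = ctx->nexpected < argc ? ctx->nexpected : argc;
  for (int i = 0; i < n; ++i)
    argv[i] = ctx->expected[i];
  return n;
}

/* User-side reporter: it never reads the fields of pctx_t directly.  */
int report_syntax_error(const pctx_t *ctx, char *buf, size_t size) {
  symbol_kind_t expected[4];
  int n = pctx_expected_tokens(ctx, expected, 4);
  int len = snprintf(buf, size, "syntax error, unexpected %s",
                     symbol_name(pctx_token(ctx)));
  for (int i = 0; i < n && len >= 0 && (size_t)len < size; ++i)
    len += snprintf(buf + len, size - (size_t)len, "%s %s",
                    i == 0 ? ", expecting" : " or", symbol_name(expected[i]));
  return len;
}
```

Because the reporter depends only on the accessors, the layout of the context structure can change without breaking user code, which is the point the commit message makes.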
Akim Demaille
ffa46e6516 skeletons: clarify the tag of special tokens
From

    GRAM_EOF = 0,                  /* $end  */
    GRAM_ERRCODE = 1,              /* error  */
    GRAM_UNDEF = 2,                /* $undefined  */

to

    GRAM_EOF = 0,                  /* "end of file"  */
    GRAM_ERRCODE = 1,              /* error  */
    GRAM_UNDEF = 2,                /* "invalid token"  */

* src/output.c (symbol_tag): New.
Use it to pass the token names and the symbol tags to the skeletons.

* tests/input.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
ff50f6f223 skeletons: use "invalid token" instead of "$undefined"
* src/output.c (prepare_symbol_names): Also handle undeftoken.
* tests/actions.at, tests/calc.at, tests/regression.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
05be0fef95 skeletons: make the eof token translatable if i18n is enabled
* src/output.c (has_translations): New.
(prepare_symbol_names): Translate endtoken if the user already
translated tokens.

* examples/c/bistromathic/parse.y, src/parse-gram.y: Simplify.
2020-04-12 13:56:44 +02:00
Akim Demaille
72c9fa4510 skeletons: use "end of file" instead of "$end"
The name "$end" is nice in the report, in particular it avoids that
pointed-rules (aka items) be too long.  It also helps keeping them
"standard".

But it is bad in error messages, we should report "end of file" (or
maybe "end of input", this is debatable).  So, unless the user already
defined the alias for the error token herself, make it "end of file".
It should even be translated if the user already translated some
tokens, so that there is now no strong reason to redefine the $end
token.

* src/output.c (prepare_symbol_names): Issue "end of file" instead of
"$end".

* data/skeletons/lalr1.java (yytnamerr_): Remove the renaming hack.

* build-aux/update-test: Accept files with names containing a "+",
such as c++.at.
* tests/actions.at, tests/c++.at, tests/conflicts.at,
* tests/glr-regression.at, tests/regression.at, tests/skeletons.at:
Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
a555b41990 diagnostics: replace "user token number" by "token code"
Yet, don't change the structure identifier to avoid introducing
conflicts in Vincent Imbimbo's PR (which, amusingly enough, is about
conflicts).

* src/symtab.c: here.
* tests/diagnostics.at, tests/input.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
ecf5cb7e0e c++: remove the yy prefix from some functions
yy::parser features a parse() function, not a yyparse() one.

* data/skeletons/lalr1.cc (yyreport_syntax_error)
(context::yyexpected_tokens): Rename as...
(report_syntax_error, context::expected_tokens): these.
2020-04-12 13:56:44 +02:00
Akim Demaille
e50de09886 tokens: properly define the YYEOF token kind
Currently EOF is handled in an ad hoc way, with a #define YYEOF 0 in
the implementation file.  As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.

Give the $end token a visible kind name, YYEOF.  Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.

* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
2020-04-12 13:56:44 +02:00
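The collision concern in the commit above comes from C enumerators sharing one namespace per translation unit. A small sketch under stated assumptions: the prefixes `CALC_` and `JSON_`, the token values other than 0 for end of file, and the scanner `calc_lex` are all hypothetical, chosen only to show two prefixed end-of-file kinds coexisting and a scanner returning the named kind instead of a bare 0.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical token kinds of two parsers used in the same program.
   With unprefixed names, both enums would try to define YYEOF.  */
enum calc_token_kind { CALC_EOF = 0, CALC_UNDEF = 257, CALC_NUM = 258 };
enum json_token_kind { JSON_EOF = 0, JSON_UNDEF = 257, JSON_STRING = 258 };

/* Hypothetical scanner: decimal integers separated by spaces.  */
enum calc_token_kind calc_lex(const char **input, int *value) {
  assert(input && *input && value);
  while (**input == ' ')
    ++*input;
  if (**input == '\0')
    return CALC_EOF;            /* the named kind, rather than 0 */
  if (isdigit((unsigned char)**input)) {
    *value = 0;
    while (isdigit((unsigned char)**input))
      *value = *value * 10 + (*(*input)++ - '0');
    return CALC_NUM;
  }
  ++*input;                     /* skip the offending character */
  return CALC_UNDEF;
}
```

Both enums compile side by side because each enumerator carries its parser's prefix.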
Akim Demaille
95421df67b tokens: define the "$undefined" token kind
* data/skeletons/bison.m4 (b4_symbol_token_kind): Give a definition to
$undefined.
(b4_token_visible_if): $undefined has an id.
* src/output.c (prepare_symbol_definitions): Stop lying: $undefined
_is_ a token.
* tests/input.at: Adjust.
2020-04-12 13:56:43 +02:00
Akim Demaille
a4ed94bc13 tokens: properly define the "error" token kind
There are people out there that do use YYERRCODE (the token kind of
the error token).  See for instance
3812012bb7/unixODBC-2.3.2/Drivers/nn/yylex.c.

Currently, YYERRCODE is defined by yacc.c in an ad hoc way as a #define
in the *.c file only.  It belongs with the other token kinds.

YYERRCODE is not a nice name, it does not fit in our naming scheme.
YYERROR would be more logical, but it collides with the YYERROR macro.
Shall we keep the same name in all the skeletons?  Besides, to avoid
collisions in C, we need to apply the api prefix: YYERRCODE is
actually <PREFIX>ERRCODE.  This is not needed in the other languages.

* data/skeletons/bison.m4 (b4_symbol_token_kind): New.
Map the error token to "YYERRCODE".
* data/skeletons/yacc.c (YYERRCODE): Don't define it, it's handled by...
* src/output.c (prepare_symbol_definitions): this.
* tests/input.at (Redefining the error token): Check it.
2020-04-12 13:56:43 +02:00
Akim Demaille
07726f1178 tokens: style: minor fixes
* data/skeletons/bison.m4 (b4_symbol_kind): Dispatch on the UNDEF
token number rather than its name.
* data/skeletons/c++.m4, data/skeletons/c.m4, data/skeletons/java.m4:
Comment changes.
2020-04-12 13:56:43 +02:00
Akim Demaille
e78596955d glr.cc: remove dead code
* data/skeletons/glr.cc: here.
2020-04-12 13:56:43 +02:00
Akim Demaille
ecd5cae2d4 c++: fix generated headers
A forthcoming commit (tokens: properly define the "error" token kind)
revealed a problem in the C++ generated headers: they are not
self-contained.  With this file:

    %language "c++"
    %define api.value.type variant

    %code {
      static int yylex (yy::parser::semantic_type *lvalp);
    }

    %token <int> X

    %%

    exp:
      X { printf ("x\n"); }
    ;

    %%

    void
    yy::parser::error (const std::string& m)
    {
      std::cerr << m << '\n';
    }

    static
    int yylex (yy::parser::semantic_type *lvalp)
    {
      static int const input[] = {yy::parser::token::X, 0};
      static int toknum = 0;
      return input[toknum++];
    }

    int
    main (int argc, char const* argv[])
    {
      yy::parser p;
      return p.parse ();
    }

the generated header fails to compile cleanly (foo.cc just #includes
the generated header):

    $ clang++-mp-9.0 -c -Wundefined-func-template foo.cc
    In file included from foo.cc:1:
    bar.tab.hh:550:12: warning: instantiation of function 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' required here, but no definition is available
          [-Wundefined-func-template]
        struct symbol_type : basic_symbol<by_type>
               ^
    bar.tab.hh:436:7: note: forward declaration of template entity is here
          basic_symbol (basic_symbol&& that);
          ^
    bar.tab.hh:550:12: note: add an explicit instantiation declaration to suppress this warning if 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' is explicitly instantiated
          in another translation unit
        struct symbol_type : basic_symbol<by_type>
               ^
    1 warning generated.

* data/skeletons/c++.m4 (b4_public_types_define): Move the
implementation of the basic_symbol move-ctor to...
(b4_public_types_define): here, its declaration.
* tests/headers.at (Sane headers): Use a declared token so that the
corresponding token constructor is declared.  Which triggers the
aforementioned issue.
2020-04-12 13:56:21 +02:00
Akim Demaille
8dcc25a1e4 style: rename YYNOMEM as YYENOMEM
This is clearer.

* data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): Rename as...
(YYENOMEM): here.
2020-04-10 18:35:29 +02:00
Akim Demaille
1a5f74d2f6 todo: update
2020-04-10 18:35:29 +02:00
Akim Demaille
00a654c8ad c++: improvements on symbol kinds
Instead of

    /// (Internal) symbol kind.
    enum symbol_kind_type
    {
      YYNTOKENS = 5, ///< Number of tokens.
      YYSYMBOL_YYEMPTY = -2,
      YYSYMBOL_YYEOF = 0,                      // END_OF_FILE
      YYSYMBOL_YYERROR = 1,                    // error
      YYSYMBOL_YYUNDEF = 2,                    // $undefined
      YYSYMBOL_TEXT = 3,                       // TEXT
      YYSYMBOL_NUMBER = 4,                     // NUMBER
      YYSYMBOL_YYACCEPT = 5,                   // $accept
      YYSYMBOL_result = 6,                     // result
      YYSYMBOL_list = 7,                       // list
      YYSYMBOL_item = 8                        // item
    };

generate

    /// Symbol kinds.
    struct symbol_kind
    {
      enum symbol_kind_type
      {
        YYNTOKENS = 5, ///< Number of tokens.
        S_YYEMPTY = -2,
        S_YYEOF = 0,                             // END_OF_FILE
        S_YYERROR = 1,                           // error
        S_YYUNDEF = 2,                           // $undefined
        S_TEXT = 3,                              // TEXT
        S_NUMBER = 4,                            // NUMBER
        S_YYACCEPT = 5,                          // $accept
        S_result = 6,                            // result
        S_list = 7,                              // list
        S_item = 8                               // item
      };
    };

* data/skeletons/c++.m4 (api.symbol.prefix): Define to S_.
Adjust all the uses.
(b4_public_types_declare): Nest the enum inside 'struct symbol_kind'.
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* tests/headers.at, tests/local.at: Adjust.
2020-04-10 18:35:29 +02:00
Akim Demaille
6c5f690da4 d: improvements on symbol kinds
    public enum SymbolKind
    {
      S_YYEMPTY = -2,    /* No symbol.  */
      S_YYEOF = 0,       /* $end  */
      S_YYERROR = 1,     /* error  */
      S_YYUNDEF = 2,     /* $undefined  */
      S_EQ = 3,          /* "="  */

* data/skeletons/d.m4 (api.symbol.prefix): Default to S_.
Output the symbol kind definitions with a comment.
2020-04-10 18:35:29 +02:00
Akim Demaille
007e1b5f0a symbols: minor fixes
* data/skeletons/bison.m4 (b4_symbol_kind): Series of _ are useless,
one is enough.
* data/skeletons/c.m4 (b4_token_enum): Fix overquoting.
2020-04-10 18:33:02 +02:00
Akim Demaille
bbb9750b3e skeletons: introduce api.symbol.prefix
* data/skeletons/bison.m4 (b4_symbol_prefix): New.
(b4_symbol_kind): Use it.
* data/skeletons/c++.m4, data/skeletons/c.m4, data/skeletons/d.m4
* data/skeletons/java.m4 (api.symbol.prefix): Provide a default value.

* data/skeletons/glr.c, data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java, data/skeletons/yacc.c:
Adjust: use b4_symbol_prefix instead of YYSYMBOL_.
2020-04-07 08:40:16 +02:00
Akim Demaille
52d0e77c2c java: also emit documenting comments for symbol kinds
* data/skeletons/java.m4 (b4_symbol_enum): here.
And stop defining YYSYMBOL_YYEMPTY, we no longer use it.
2020-04-07 08:39:54 +02:00
Akim Demaille
088d5668a9 todo: update
* TODO (YYERRCODE): Remove, handled by YYSYMBOL_ERROR.
2020-04-06 19:20:02 +02:00
Akim Demaille
87579e03e0 skeletons: beware not to use yyarg when it's null
Reported by Adrian Vogelsgesang.

* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Here.
2020-04-06 19:14:11 +02:00
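The commit above gives no detail beyond "don't use yyarg when it's null". One common way to guard such an output array is a "count only" mode, sketched here under that assumption. The function `collect_expected` and its parameters are hypothetical and do not reproduce the skeleton code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: gather the indices of the symbols flagged as
   expected.  When OUT is null, only count them.  When there are more
   than CAP of them, fill OUT up to CAP and return 0.  */
int collect_expected(const int *is_expected, int nsymbols, int *out, int cap) {
  assert(is_expected && nsymbols >= 0 && cap >= 0);
  int count = 0;
  for (int sym = 0; sym < nsymbols; ++sym) {
    if (!is_expected[sym])
      continue;
    if (!out)
      ++count;                  /* counting mode: never dereference OUT */
    else if (count == cap)
      return 0;                 /* too many symbols to list */
    else
      out[count++] = sym;
  }
  return count;
}
```

A caller can first pass a null array to learn how many entries are needed, then call again with a buffer of that size.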
Akim Demaille
11225a5d2f java: document new features
* data/skeletons/lalr1.java: More comments.
(Context.EMPTY): Remove.
* doc/bison.texi (Java Parser Context Interface): New.
2020-04-06 19:14:11 +02:00
Akim Demaille
3dcfb4fd88 java: prefer null to YYSYMBOL_YYEMPTY
That's one nice benefit from using enums.

* data/skeletons/lalr1.java (YYSYMBOL_YYEMPTY): No longer define it.
Use 'null' instead.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-06 19:14:11 +02:00
Akim Demaille
c0ccb8e5b4 java: rename Lexer.yyreportSyntaxError as reportSyntaxError
* data/skeletons/lalr1.java: here.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-06 19:14:06 +02:00
Akim Demaille
79f967ac0d java: use getExpectedTokens, not yyexpectedTokens
* data/skeletons/lalr1.java, examples/java/calc/Calc.y, tests/local.at:
here.
2020-04-06 18:43:34 +02:00
Akim Demaille
0f6ab8e692 java: style: fix coding style
* data/skeletons/java.m4: Indent by two.
* data/skeletons/lalr1.java (yynnts_): Remove.
(yyfinal_, yyntokens_, yylast_, yyempty_): Rename as...
(YYFINAL_, YYNTOKENS_, YYLAST_, YYEMPTY_): these, they are constants.
2020-04-06 18:43:34 +02:00
Akim Demaille
e657f04b62 c: make the symbol kind definition nicer to read
From

    enum yysymbol_kind_t
    {
      YYSYMBOL_YYEMPTY = -2,
      YYSYMBOL_YYEOF = 0,
      YYSYMBOL_YYERROR = 1,
      YYSYMBOL_YYUNDEF = 2,

to

    enum yysymbol_kind_t
    {
      YYSYMBOL_YYEMPTY = -2,
      YYSYMBOL_YYEOF = 0,                      /* "end of file"  */
      YYSYMBOL_YYERROR = 1,                    /* error  */
      YYSYMBOL_YYUNDEF = 2,                    /* $undefined  */

* data/skeletons/bison.m4 (b4_last_symbol): New.
(b4_symbol_enum, b4_symbol_enums): Reformat the output.
* data/skeletons/c.m4
2020-04-06 18:43:34 +02:00
Akim Demaille
10e61eec6d c: make the token kind definition nicer to read
From

    enum gram_tokentype
    {
      GRAM_EOF = 0,
      STRING = 3,
      TSTRING = 4,
      PERCENT_TOKEN = 5,

to

    enum gram_tokentype
    {
      GRAM_EOF = 0,                  /* "end of file"  */
      STRING = 3,                    /* "string"  */
      TSTRING = 4,                   /* "translatable string"  */
      PERCENT_TOKEN = 5,             /* "%token"  */

* data/skeletons/bison.m4 (b4_last_enum_token): New.
* data/skeletons/c.m4 (b4_token_enum, b4_token_enums): Show the
corresponding symbol.
2020-04-06 18:43:34 +02:00
Akim Demaille
149e280aab c: make the generated YYSTYPE nicer to read
From

    union GRAM_STYPE
    {
      /* precedence_declarator  */
      assoc precedence_declarator;
      /* "string"  */
      char* STRING;
      /* "translatable string"  */
      char* TSTRING;
      /* "{...}"  */
      char* BRACED_CODE;
      /* "%?{...}"  */

to

    union GRAM_STYPE
    {
      assoc precedence_declarator;             /* precedence_declarator  */
      char* STRING;                            /* "string"  */
      char* TSTRING;                           /* "translatable string"  */
      char* BRACED_CODE;                       /* "{...}"  */

* data/skeletons/c.m4 (b4_symbol_type_register): Use m4_format to
align the comments.
* src/parse-gram.h: Regen.
2020-04-06 18:43:34 +02:00