Commit Graph

1724 Commits

Author SHA1 Message Date
Akim Demaille
ffa46e6516 skeletons: clarify the tag of special tokens
From

    GRAM_EOF = 0,                  /* $end  */
    GRAM_ERRCODE = 1,              /* error  */
    GRAM_UNDEF = 2,                /* $undefined  */

to

    GRAM_EOF = 0,                  /* "end of file"  */
    GRAM_ERRCODE = 1,              /* error  */
    GRAM_UNDEF = 2,                /* "invalid token"  */

* src/output.c (symbol_tag): New.
Use it to pass the token names and the symbol tags to the skeletons.

* tests/input.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
ff50f6f223 skeletons: use "invalid token" instead of "$undefined"
* src/output.c (prepare_symbol_names): Also handle undeftoken.
* tests/actions.at, tests/calc.at, tests/regression.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
72c9fa4510 skeletons: use "end of file" instead of "$end"
The name "$end" is nice in the report, in particular it avoids that
pointed-rules (aka items) be too long.  It also helps keeping them
"standard".

But it is bad in error messages, we should report "end of file" (or
maybe "end of input", this is debatable).  So, unless the user already
defined the alias for the error token herself, make it "end of file".
It should even be translated if the user already translated some
tokens, so that there is now no strong reason to redefine the $end
token.

* src/output.c (prepare_symbol_names): Issue "end of file" instead of
"$end".

* data/skeletons/lalr1.java (yytnamerr_): Remove the renaming hack.

* build-aux/update-test: Accept files with names containing a "+",
such as c++.at.
* tests/actions.at, tests/c++.at, tests/conflicts.at,
* tests/glr-regression.at, tests/regression.at, tests/skeletons.at:
Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
a555b41990 diagnostics: replace "user token number" by "token code"
Yet, don't change the structure identifier to avoid introducing
conflicts in Vincent Imbimbo's PR (which, amusingly enough, is about
conflicts).

* src/symtab.c: here.
* tests/diagnostics.at, tests/input.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
ecf5cb7e0e c++: remove the yy prefix from some functions
yy::parser features a parse() function, not a yyparse() one.

* data/skeletons/lalr1.cc (yyreport_syntax_error)
(context::yyexpected_tokens): Rename as...
(report_syntax_error, context::expected_tokens): these.
2020-04-12 13:56:44 +02:00
Akim Demaille
e50de09886 tokens: properly define the YYEOF token kind
Currently EOF is handled in an adhoc way, with a #define YYEOF 0 in
the implementation file.  As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.

Give the $end token a visible kind name, YYEOF.  Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.

* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
2020-04-12 13:56:44 +02:00
Akim Demaille
95421df67b tokens: define the "$undefined" token kind
* data/skeletons/bison.m4 (b4_symbol_token_kind): Give a definition to
$undefined.
(b4_token_visible_if): $undefined has an id.
* src/output.c (prepare_symbol_definitions): Stop lying: $undefined
_is_ a token.
* tests/input.at: Adjust.
2020-04-12 13:56:43 +02:00
Akim Demaille
a4ed94bc13 tokens: properly define the "error" token kind
There are people out there that do use YYERRCODE (the token kind of
the error token).  See for instance
3812012bb7/unixODBC-2.3.2/Drivers/nn/yylex.c.

Currently, YYERRCODE is defined by yacc.c in an adhoc way as a #define
in the *.c file only.  It belongs with the other token kinds.

YYERRCODE is not a nice name, it does not fit in our naming scheme.
YYERROR would be more logical, but it collides with the YYERROR macro.
Shall we keep the same name in all the skeletons?  Besides, to avoid
collisions in C, we need to apply the api prefix: YYERRCODE is
actually <PREFIX>ERRCODE.  This is not needed in the other languages.

* data/skeletons/bison.m4 (b4_symbol_token_kind): New.
Map the error token to "YYERRCODE".
* data/skeletons/yacc.c (YYERRCODE): Don't define it, it's handled by...
* src/output.c (prepare_symbol_definitions): this.
* tests/input.at (Redefining the error token): Check it.
2020-04-12 13:56:43 +02:00
Akim Demaille
ecd5cae2d4 c++: fix generated headers
A forthcoming commit (tokens: properly define the "error" token kind)
revealed a problem in the C++ generated headers: they are not
self-contained.  With this file:

    %language "c++"
    %define api.value.type variant

    %code {
      static int yylex (yy::parser::semantic_type *lvalp);
    }

    %token <int> X

    %%

    exp:
      X { printf ("x\n"); }
    ;

    %%

    void
    yy::parser::error (const std::string& m)
    {
      std::cerr << m << '\n';
    }

    static
    int yylex (yy::parser::semantic_type *lvalp)
    {
      static int const input[] = {yy::parser::token::X, 0};
      static int toknum = 0;
      return input[toknum++];
    }

    int
    main (int argc, char const* argv[])
    {
      yy::parser p;
      return p.parse ();
    }

the generated header fails to compile cleanly (foo.cc just #includes
the generated header):

    $ clang++-mp-9.0 -c -Wundefined-func-template foo.cc
    In file included from foo.cc:1:
    bar.tab.hh:550:12: warning: instantiation of function 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' required here, but no definition is available
          [-Wundefined-func-template]
        struct symbol_type : basic_symbol<by_type>
               ^
    bar.tab.hh:436:7: note: forward declaration of template entity is here
          basic_symbol (basic_symbol&& that);
          ^
    bar.tab.hh:550:12: note: add an explicit instantiation declaration to suppress this warning if 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' is explicitly instantiated
          in another translation unit
        struct symbol_type : basic_symbol<by_type>
               ^
    1 warning generated.

* data/skeletons/c++.m4 (b4_public_types_define): Move the
implementation of the basic_symbol move-ctor to...
(b4_public_types_define): here, its declaration.
* tests/headers.at (Sane headers): Use a declared token so that the
corresponding token constructor is declared.  Which triggers the
aforementioned issue.
2020-04-12 13:56:21 +02:00
Akim Demaille
00a654c8ad c++: improvements on symbol kinds
Instead of

    /// (Internal) symbol kind.
    enum symbol_kind_type
    {
      YYNTOKENS = 5, ///< Number of tokens.
      YYSYMBOL_YYEMPTY = -2,
      YYSYMBOL_YYEOF = 0,                      // END_OF_FILE
      YYSYMBOL_YYERROR = 1,                    // error
      YYSYMBOL_YYUNDEF = 2,                    // $undefined
      YYSYMBOL_TEXT = 3,                       // TEXT
      YYSYMBOL_NUMBER = 4,                     // NUMBER
      YYSYMBOL_YYACCEPT = 5,                   // $accept
      YYSYMBOL_result = 6,                     // result
      YYSYMBOL_list = 7,                       // list
      YYSYMBOL_item = 8                        // item
    };

generate

    /// Symbol kinds.
    struct symbol_kind
    {
      enum symbol_kind_type
      {
        YYNTOKENS = 5, ///< Number of tokens.
        S_YYEMPTY = -2,
        S_YYEOF = 0,                             // END_OF_FILE
        S_YYERROR = 1,                           // error
        S_YYUNDEF = 2,                           // $undefined
        S_TEXT = 3,                              // TEXT
        S_NUMBER = 4,                            // NUMBER
        S_YYACCEPT = 5,                          // $accept
        S_result = 6,                            // result
        S_list = 7,                              // list
        S_item = 8                               // item
      };
    };

* data/skeletons/c++.m4 (api.symbol.prefix): Define to S_.
Adjust all the uses.
(b4_public_types_declare): Nest the enum inside 'struct symbol_kind'.
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* tests/headers.at, tests/local.at: Adjust.
2020-04-10 18:35:29 +02:00
Akim Demaille
3dcfb4fd88 java: prefer null to YYSYMBOL_YYEMPTY
That's one nice benefit from using enums.

* data/skeletons/lalr1.java (YYSYMBOL_YYEMPTY): No longer define it.
Use 'null' instead.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-06 19:14:11 +02:00
Akim Demaille
c0ccb8e5b4 java: rename Lexer.yyreportSyntaxError as reportSyntaxError
* data/skeletons/lalr1.java: here.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-06 19:14:06 +02:00
Akim Demaille
79f967ac0d java: use getExpectedTokens, not yyexpectedTokens
* data/skeletons/lalr1.java, examples/java/calc/Calc.y, tests/local.at:
here.
2020-04-06 18:43:34 +02:00
Akim Demaille
cc68bbf799 bison: use consistently "token kind", not "token type"
* src/output.c, src/reader.c, src/scan-gram.l, src/tables.c: here.
2020-04-05 19:14:39 +02:00
Akim Demaille
ff2fc62138 d, java: rename SymbolType as SymbolKind
See https://lists.gnu.org/r/bison-patches/2020-04/msg00031.html.

* data/skeletons/d.m4, data/skeletons/lalr1.d,
* data/skeletons/java.m4, data/skeletons/lalr1.java
(SymbolType): Rename as...
(SymbolKind): this.
Adjust dependencies.
2020-04-05 14:56:19 +02:00
Akim Demaille
2c05fc750a c, c++: rename yysymbol_type_t as yysymbol_kind_t
See https://lists.gnu.org/r/bison-patches/2020-04/msg00031.html

* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/yacc.c
(yysymbol_type_t): Rename as...
(yysymbol_kind_t): this.
Adjust dependencies.
* data/skeletons/c++.m4, data/skeletons/glr.cc, data/skeletons/lalr1.cc
(symbol_type_type): Rename as...
(symbol_kind_type): this.
Adjust dependencies.
2020-04-05 14:56:18 +02:00
Akim Demaille
7aee4586ca Merge branch 'maint'
* maint:
  maint: post-release administrivia
  version 3.5.4
  examples: reccalc: really compile cleanly in C99
  news: announce that Bison 3.6 drops YYERROR_VERBOSE
  news: update for 3.5.4
  style: fix spellos
  typo: succesful -> successful
  package: improve the readme
  java: check and fix support for api.token.raw
  java: style: prefer 'int[] foo' to 'int foo[]'
  build: fix syntax-check issues
  tests: recheck: work properly when the test suite was interrupted
  doc: c++: promote api.token.raw
  build: fix compatibility with old compilers
  examples: reccalc: compile cleanly in C99
2020-04-05 09:38:15 +02:00
Akim Demaille
225a67321b news: update the yyreport_syntax_error example
* examples/c/bistromathic/parse.y, tests/local.at
(yyreport_syntax_error): Fix use of YYSYMBOL_YYEMPTY.
* NEWS: Update.
2020-04-05 08:56:23 +02:00
Akim Demaille
76e11b5a3e c: rename yyparse_context_t as yypcontext_t
The first name is too long.  We already have `yypstate`, so
`yypcontext` is ok.  We are also migrating to using `*_t` for our
types.

* NEWS, data/skeletons/glr.c, data/skeletons/yacc.c, doc/bison.texi,
* examples/c/bistromathic/parse.y, src/parse-gram.y, tests/local.at:
(yyparse_context_t, yyparse_context_location, yyparse_context_token):
Rename as...
(yypcontext_t, yypcontext_location, yypcontext_token): these.
2020-04-04 19:20:29 +02:00
Akim Demaille
ad31c3cdf4 java: use SymbolType
The Java enums are very different from the C model.  As a consequence,
one cannot "build" an enum directly from an integer, we must retrieve
it.  That's the purpose of the SymbolType.get class method.

* data/skeletons/java.m4 (b4_symbol_enum, b4_case_code_symbol)
(b4_declare_symbol_enum): New.
* data/skeletons/lalr1.java: Use SymbolType,
SymbolType.YYSYMBOL_YYEMPTY, etc.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-04 16:42:33 +02:00
Adrian Vogelsgesang
1c273826d4 typo: succesful -> successful
* tests/calc.at: Here.
2020-04-04 10:56:47 +02:00
Akim Demaille
72f04ca80f java: check and fix support for api.token.raw
* tests/local.at (AT_LANG_MATCH, AT_YYERROR_DECLARE(java))
(AT_YYERROR_DECLARE_EXTERN(java), AT_PARSER_CLASS): New.
(AT_MAIN_DEFINE(java)): Use AT_PARSER_CLASS.
* tests/scanner.at: Add a test for Java.
* data/skeletons/lalr1.java (yytranslate_): Cast the result.
2020-04-04 10:34:53 +02:00
Akim Demaille
cb40f5c624 build: fix syntax-check issues
* src/system.h, tests/local.mk: Fix indentation.
2020-04-04 08:04:11 +02:00
Akim Demaille
6c23b012b9 tests: recheck: work properly when the test suite was interrupted
* tests/local.mk (recheck): Look at the per-test logs, not the overall
log, which, when interrupted, contains only information about... the
tests that passed.
2020-04-02 07:32:48 +02:00
Akim Demaille
7e28dbea11 c++: also use symbol_type_type
Because of the insane current implementation of glr.cc, things are a
bit nasty.  We will rename symbol_number_type as symbol_type_type
later, to keep this commit small.

* data/skeletons/c++.m4 (b4_declare_symbol_enum): New.
Also define YYNTOKENS to avoid type clashes when yyntokens_ was
actually defined in another enum.
Use it.
(symbol_number_type): Be an alias of symbol_type_type.
Use YYSYMBOL_YYEMPTY and the like.
Use symbol_number_type where appropriate.
(empty_symbol): Remove.
(yytranslate_): Use symbol_number_type, not token_number_type.
* data/skeletons/lalr1.cc: Use symbol_number_type where appropriate.
Adjust to the replacement of empty_symbol by YYSYMBOL_YYEMPTY.
(yy_error_token_, yy_undef_token_, yyeof_, yyntokens_): Remove.
Adjust dependencies.

* data/skeletons/glr.cc: Use symbol_number_type where appropriate.
Forward definitions of YYSYMBOL_YYEMPTY, etc. to glr.c.

* tests/headers.at: Accept YYNTOKENS and other YYSYMBOL_*.
* tests/local.at (AT_YYERROR_DEFINE(c++)): Use symbol_number_type.
2020-04-01 08:32:50 +02:00
Akim Demaille
086506bf23 glr.c, yacc.c: propagate yysymbol_type_t
Now that yacc.c and glr.c both know yysymbol_type_t, convert the
common routines.

* data/skeletons/c.m4 (yydestruct, yy_symbol_value_print)
(yy_symbol_print): Use yysymbol_type_t instead of int.
* data/skeletons/glr.c: Use yySymbol where appropriate.
* data/skeletons/yacc.c (YY_ACCESSING_SYMBOL): New wrapper around
yystos.
Use it.
* tests/local.at (yyreport_syntax_error): Use yysymbol_type_t where
appropriate.
2020-04-01 08:31:48 +02:00
Akim Demaille
9434571f95 yacc.c: revert to not using yysymbol_type_t in the yytranslate table
This triggers warnings with several compilers.  For instance ICC fills
the logs with pages and pages of

    input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}"
             0,     2,     2,     2,     2,     2,     2,     2,     2,     2,
             ^

    input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}"
             0,     2,     2,     2,     2,     2,     2,     2,     2,     2,
                    ^

And so does G++9 when compiling yacc.c's (C) output

    input.c:545:8: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive]
      545 |        0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
          |        ^
          |        |
          |        int
    input.c:545:15: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive]
      545 |        0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
          |               ^
          |               |
          |               int

Clang++ is no exception

    input.c:545:8: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int'
           0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
           ^
    input.c:545:15: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int'
           0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
                  ^

At some point we could use yysymbol_type_t's enumerators to define
yytranslate.  Meanwhile...

* data/skeletons/yacc.c (yytranslate): Use the original integral type
to define it.
(YYTRANSLATE): Cast the result into yysymbol_type_t.
2020-04-01 08:31:48 +02:00
Akim Demaille
f3c18c8e80 yacc.c: also define a symbol number for the empty token
This is not only cleaner, it also protects us from mixing signed
values (YYEMPTY is #defined as -2) with unsigned types (the
yysymbol_type_t enum is typically compiled as a small unsigned).
For instance GCC 9:

    input.c: In function 'yyparse':
    input.c:1107:7: error: conversion to 'unsigned int' from 'int'
                           may change the sign of the result
                           [-Werror=sign-conversion]
     1107 |   yyn += yytoken;
          |       ^~
    input.c:1107:10: error: conversion to 'int' from 'unsigned int'
                            may change the sign of the result
                            [-Werror=sign-conversion]
     1107 |   yyn += yytoken;
          |          ^~~~~~~
    input.c:1108:47: error: comparison of integer expressions of
                            different signedness:
                            'yytype_int8' {aka 'const signed char'} and
                            'yysymbol_type_t' {aka 'enum yysymbol_type_t'}
                            [-Werror=sign-compare]
     1108 |   if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken)
          |                                               ^~
    input.c:702:25: error: operand of ?: changes signedness from 'int'
                           to 'unsigned int' due to unsignedness of
                           other operand [-Werror=sign-compare]
      702 | #define YYEMPTY         (-2)
          |                         ^~~~
    input.c:1220:33: note: in expansion of macro 'YYEMPTY'
     1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
          |                                 ^~~~~~~
    input.c:1220:41: error: unsigned conversion from 'int' to
                            'unsigned int' changes value
                            from '-2' to '4294967294'
                            [-Werror=sign-conversion]
     1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
          |                                         ^

Eventually, it might be interesting to move away from -2 (which is the
only possible negative symbol number) and use the next available
number, to save bits.  We could actually even simply use "0" and shift
the rest, which would allow to write "!yytoken" to mean really
"yytoken != YYEMPTY".

* data/skeletons/c.m4 (b4_declare_symbol_enum): Define YYSYMBOL_YYEMPTY.
* data/skeletons/yacc.c: Use it.

* src/parse-gram.y (yyreport_syntax_error): Use YYSYMBOL_YYEMPTY, not
YYEMPTY, when dealing with a symbol.

* tests/regression.at: Adjust.
2020-04-01 08:31:48 +02:00
Akim Demaille
af19fd7e0f tests: recheck: work properly when the test suite was interrupted
* tests/local.mk (recheck): Look at the per-test logs, not the overall
log, which, when interrupted, contains only information about... the
tests that passed.
2020-03-30 08:41:12 +02:00
Akim Demaille
2c74872991 java: move away from _ for internationalization
The "_" is becoming a keyword in Java, which causes tons of warnings
currently in our test suite.  GNU Gettext is now using "i18n" instead
of "_"
(https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=e89fea36545f27487d9652a13e6a0adbea1117d0).

* data/skeletons/java.m4: Use "i18n", not "_".
* examples/java/calc/Calc.y, tests/calc.at: Adjust.
2020-03-30 08:03:10 +02:00
Akim Demaille
b7045aa706 java: make yysyntaxErrorArguments a private detail
* data/skeletons/lalr1.java (yysyntaxErrorArguments): Move it from the
context, to the parser object.
Generate only for detailed and verbose error messages.
* tests/local.at (AT_YYERROR_DEFINE(java)): Use yyexpectedTokens
instead.
2020-03-28 15:13:27 +01:00
Akim Demaille
1edc98f793 lalr1.cc: avoid using yysyntax_error_arguments
* data/skeletons/lalr1.cc (context::token): New.
* tests/local.at (yyreport_syntax_error): Don't use
yysyntax_error_arguments.
2020-03-28 15:13:27 +01:00
Akim Demaille
00b0d02955 tests: yacc.c: avoid yysyntax_error_arguments
Because glr.c shares the same testing routines, we also need to
convert it.

* data/skeletons/glr.c (yyparse_context_token): New.
* tests/local.at (yyreport_syntax_error): here.
2020-03-28 15:13:27 +01:00
Akim Demaille
1045c8d0ef examples: don't use yysyntax_error_arguments
Suggested by Adrian Vogelsgesang.
https://lists.gnu.org/archive/html/bison-patches/2020-02/msg00069.html

* data/skeletons/lalr1.java (Context.EMPTY, Context.getToken): New.
(Context.yyntokens): Rename as...
(Context.NTOKENS): this.
Because (i) all the Java coding styles recommend upper case for
constants, and (ii) the Java Skeleton exposes Lexer.EOF, not
Lexer.YYEOF.
* data/skeletons/yacc.c (yyparse_context_token): New.
* examples/c/bistromathic/parse.y (yyreport_syntax_error): Don't use
yysyntax_error_arguments.
* examples/java/calc/Calc.y (yyreportSyntaxError): Likewise.
2020-03-28 15:13:27 +01:00
Akim Demaille
84b1972c96 yacc.c: use negative numbers for errors in auxiliary functions
yyparse returns 0, 1, 2 since ages (accept, reject, memory exhausted).
Some of our auxiliary functions such as yy_lac and
yyreport_syntax_error also need to return error codes and also use 0,
1, 2.  Because it uses yy_lac, yyexpected_tokens also needs to return
"problem", "memory exhausted", but in case of success, it needs to
return the number of tokens, so it cannot use 1 and 2 as error code.
Currently it uses -1 and -2, which is later converted into 1 and 2 as
yacc.c expects it.

Let's simplify this and use consistently -1 and -2 for auxiliary
functions that are not exposed (or not yet exposed) to the user.  In
particular this will save the user from having to convert
yyexpected_tokens's -2 into yyreport_syntax_error's 2: both return -1
or -2.

* data/skeletons/yacc.c (yy_lac, yyreport_syntax_error)
(yy_lac_stack_realloc): Return -1, -2 for errors instead of 1, 2.
Adjust callers.
* examples/c/bistromathic/parse.y (yyreport_syntax_error): Do take
error codes into account.
Issue a syntax error message even if we ran out of memory.
* src/parse-gram.y, tests/local.at (yyreport_syntax_error): Adjust.
2020-03-23 07:02:36 +01:00
Akim Demaille
951da960e6 merge branch 'maint'
* upstream/maint:
  maint: post-release administrivia
  version 3.5.3
  news: update for 3.5.3
  yacc.c: make sure we properly propagated the user's number for error
  diagnostics: don't crash because of repeated definitions of error
  style: initialize some struct members
  diagnostics: beware of zero-width characters
  diagnostics: be sure to close the styling when lines are too short
  muscles: fix incorrect decoding of $
  code: be robust to reference with invalid tags
  build: fix typo
  doc: update recommandation for libtextstyle
  style: comment changes
  examples: use consistently the GFDL header for readmes
  style: remove useless declarations
  typo: succesful -> successful
  README: point to tests/bison, and document --trace
  gnulib: update
  maint: post-release administrivia
2020-03-08 10:13:16 +01:00
Akim Demaille
e3812bb8c3 yacc.c: make sure we properly propagated the user's number for error
* data/skeletons/yacc.c (YYERRCODE): Be truthful.
* tests/input.at (Redefining the error token): Check that.
2020-03-08 08:10:11 +01:00
Akim Demaille
cfcd823e16 diagnostics: don't crash because of repeated definitions of error
According to https://www.unix.com/man-page/POSIX/1posix/yacc/, the
user is allowed to specify her user number for the error token:

    The token error shall be reserved for error handling. The name
    error can be used in grammar rules. It indicates places where the
    parser can recover from a syntax error. The default value of error
    shall be 256. Its value can be changed using a %token
    declaration. The lexical analyzer should not return the value of
    error.

I think this feature is useless, the user should not have to deal with
that.  The intend is probably to give the user a means to use 256 if
she wants to, but provided "error" cleared the path first by being
assigned another number.  In the case of Bison, 256 is assigned to
"error" at the end if the user did not use it for a token of hers.  So
this feature is useless.

Yet it is valid, and if the user assigns twice a token number to
"error", then the second time we want to complain about it and want to
show the original definition.  At this point, we try to display the
built-in definition of "error", whose location is NULL, and we crash.

Rather, the location of the first user definition of "error" should
become its defining location.

Reported byg Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00007.html

* src/symtab.c (symbol_class_set): If this is a declaration and the
symbol was not declared yet, keep this as defining location.
* tests/input.at (Redefining the error token): New.
2020-03-08 08:10:11 +01:00
Akim Demaille
b638603477 diagnostics: beware of zero-width characters
Currenly we rely on (visual) width of the characters to decide where
to open and close the styling of the quoted lines.  This breaks when
we deal with zero-width characters: we cannot just rely on (visual)
columns, we need to know whether we are before, inside, or after the
highlighted portion.

* src/location.c (location_caret): col_end: no longer add 1, "regular"
characters have a width of 1, only 0-width characters have 0-width.
opened: replace with 'state', a three-valued enum.
Don't reopen the style if we already did.
* tests/diagnostics.at (Zero-width characters): New.
2020-03-08 08:10:11 +01:00
Akim Demaille
e21ff47f5d diagnostics: be sure to close the styling when lines are too short
bar.y:4.12-17: <error>error:</error> redefining user token number of foo
    -    4 | %token foo <error>123
    +    4 | %token foo <error>123</error>
           |            <error>^~~~~~</error>

* src/location.c (location_caret): Be sure to close.
* tests/diagnostics.at (Line is too short, and then you die): New.
2020-03-07 10:01:52 +01:00
Akim Demaille
b82b387da9 muscles: fix incorrect decoding of $
Bug introduced in 458171e6df.
https://lists.gnu.org/archive/html/bison-patches/2013-11/msg00009.html

Reported by Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00010.html

* src/muscle-tab.c (COMMON_DECODE): "$" is coded as "$][", not "$[][".
* tests/input.at ("%define" enum variables): Check that case.
2020-03-07 07:45:10 +01:00
Akim Demaille
641e326303 code: be robust to reference with invalid tags
Because we want to support $<a->b>$, we must accept -> in type tags,
and reject $<->$, as it is unfinished.
Reported by Ahcheong Lee.

* src/scan-code.l (yylex): Make sure "tag" does not end with -, since
-> does not close the tag.
* tests/input.at (Stray $ or @): Check this.
2020-03-06 17:29:26 +01:00
Akim Demaille
4fd3282dd7 style: formatting changes
* data/skeletons/yacc.c, tests/torture.at: here.
2020-03-04 08:24:36 +01:00
Adrian Vogelsgesang
c2cca46795 c++: add support for parse.error=custom
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added test cases
* tests/local.at: added yyreport_syntax_error implementation
   for C++ test cases
2020-02-27 18:13:44 +01:00
Adrian Vogelsgesang
72acecb30c c++: add support for parse.error=detailed
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added a test case
2020-02-27 18:13:43 +01:00
Adrian Vogelsgesang
368fcf0af5 typo: succesful -> successful
* data/skeletons/lalr1.cc: here
* etc/bench.pl.in: here
* src/location.c: here
* tests/calc.at: and here
2020-02-27 18:10:39 +01:00
Victor Morales Cayuela
e09a72eeb0 diagnostics: modernize the display of submessages
Since Bison 2.7, output was indented four spaces for explanatory
statements.  For example:

    input.y:2.7-13: error: %type redeclaration for exp
    input.y:1.7-11:     previous declaration

Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:

    input.y:2.7-13: error: %type redeclaration for exp
        2 | %type <float> exp
          |       ^~~~~~~
    input.y:1.7-11: note: previous declaration
        1 | %type <int> exp
          |       ^~~~~

* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
2020-02-15 08:28:40 +01:00
Akim Demaille
f3d33c3613 tests: check calls to yyerror from the user actions
This revealed a number of things I had not realized:

- the Java location tracking was aliasing the same pair of positions
  for all the symbols (see previous commit).

- in impure parsers, it's quite easy to use incorrect locations for
  diagnostics, since yyerror uses yylloc, which is the location of the
  lookahead, not that of the current lhs.  So we need something like

    {
      YYLTYPE old_yylloc = yylloc;
      yylloc = @$;
      yyerror (]AT_PARAM_IF([result, count, nerrs, ])[buf);
      yylloc = old_yylloc;
    }

  Maybe we should do that little yylloc dance in the skeleton instead
  of leaving it to the user?  It might be costly...  But that's only
  for users of the impure parsers, which are asking for trouble
  anyway.

- in glr.cc invoking yyerror is somewhat cumbersome: the C++ interface
  is not available as we are in yyparse (which in C), and yyerror is
  used by glr.cc itself to bind it to the user's parser::error.  If we
  call yyerror, we need:

    yyerror (]AT_LOCATION_IF([[&@$, ]])[yyparser, ]AT_PARAM_IF([result, count, nerrs, ])[msg);

  However calling yy::parser::error is easier, once we know that the
  current parser object is available as 'yyparser'.  Which also saves
  us from having to pass the parse-params ourselves:

    yyparser.error (]AT_LOCATION_IF([[@$, ]])[msg);

* tests/calc.at: Invoke yyerror by hand, instead of using fprintf etc.
Adjust expectations.
2020-02-12 00:00:05 +01:00
Akim Demaille
163a35d6dd java: beware not to alias the locations of the various symbols
* examples/java/calc/Calc.y, tests/calc.at, tests/local.at
(getStartPos, getEndPos): Always return a new object.
* doc/bison.texi: Clarify this.
2020-02-11 21:31:44 +01:00
Akim Demaille
cdb42f7730 java: check that parse.error custom|detailed work with push parsers
* tests/calc.at: here.
2020-02-11 08:39:08 +01:00