Commit Graph

1694 Commits

Author SHA1 Message Date
Akim Demaille
99bedadf23 m4: remove b4_function_define and b4_function_declare
* data/skeletons/c.m4: here.
2020-03-02 06:58:20 +01:00
Akim Demaille
79c3f2b8fd m4: decommission b4_function_declare
* data/skeletons/glr.c, data/skeletons/glr.cc, data/skeletons/yacc.c:
Stop using b4_function_declare.
2020-03-02 06:58:14 +01:00
Akim Demaille
4cca30d2e6 m4: decommission function generating macro
These macros have been extremely useful when we had to support K&R C,
which we dropped long ago.  Now, they merely make the code uselessly
hard to read.

* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/yacc.c:
Stop using b4_function_define.
2020-03-02 06:57:50 +01:00
Akim Demaille
b870d5fee4 c++: don't copy the lookahead
The current implementation of parser::context keeps a copy of the
lookahead.  This is troublesome since we support move-only types.
Besides, while GCC is happy with the current implementation, Clang
complains that the ctor it needs to build the copy of the lookahead is
not yet available.

    461. calc.at:1120: testing Calculator C++ %defines %locations parse.error=verbose %name-prefix "calc" %verbose  ...
    calc.at:1120: COLUMNS=1000; export COLUMNS;  bison --color=no -fno-caret -Wno-deprecated -o calc.cc calc.y
    calc.at:1120: $CXX $CXXFLAGS $CPPFLAGS  $LDFLAGS -o calc calc.cc calc-lex.cc calc-main.cc $LIBS
    stderr:
    In file included from calc-lex.cc:7:
    calc.hh:351:12: error: instantiation of function 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' required here, but no definition is available [-Werror,-Wundefined-func-template]
        struct symbol_type : basic_symbol<by_type>
               ^
    calc.hh:273:7: note: forward declaration of template entity is here
          basic_symbol (const basic_symbol& that);
          ^
    calc.hh:351:12: note: add an explicit instantiation declaration to suppress this warning if 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' is explicitly instantiated in another translation unit
        struct symbol_type : basic_symbol<by_type>
               ^
    1 error generated.
    In file included from calc-main.cc:7:
    calc.hh:351:12: error: instantiation of function 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' required here, but no definition is available [-Werror,-Wundefined-func-template]
        struct symbol_type : basic_symbol<by_type>
               ^
    calc.hh:273:7: note: forward declaration of template entity is here
          basic_symbol (const basic_symbol& that);
          ^
    calc.hh:351:12: note: add an explicit instantiation declaration to suppress this warning if 'calc::parser::basic_symbol<calc::parser::by_type>::basic_symbol' is explicitly instantiated in another translation unit
        struct symbol_type : basic_symbol<by_type>
               ^
    1 error generated.
    stdout:
    calc.at:1120: exit code was 1, expected 0
    461. calc.at:1120: 461. Calculator C++ %defines %locations parse.error=verbose %name-prefix "calc" %verbose  (calc.at:1120): FAILED (calc.at:1120)

* data/skeletons/lalr1.cc (context::yyla_): Make it a const-ref.
Move the implementation out of the declaration.
2020-02-27 21:49:56 +01:00
Akim Demaille
30d01b21e7 c++: minor fixes
Address compiler warnings such as

    warning: declaration of 'yyla' shadows a member of 'yy::parser::context' [-Wshadow]

* data/skeletons/lalr1.cc (context): Don't use the same names for
variables and members.
Use foo_ for private members, as in parser.
Also, use the + trick in array accesses to please ICC and provide it
with an int.
2020-02-27 21:31:09 +01:00
Adrian Vogelsgesang
c2cca46795 c++: add support for parse.error=custom
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added test cases
* tests/local.at: added yyreport_syntax_error implementation
   for C++ test cases
2020-02-27 18:13:44 +01:00
Adrian Vogelsgesang
879530cb95 c++: add parser::context for syntax error handling
* data/skeletons/lalr1.cc: here
2020-02-27 18:13:44 +01:00
Adrian Vogelsgesang
72acecb30c c++: add support for parse.error=detailed
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added a test case
2020-02-27 18:13:43 +01:00
Adrian Vogelsgesang
d4fcd5c3d0 skeletons: prefer b4_parse_error_{case,bmatch} over manual solution
Prefer b4_parse_error_case over the adhoc solution
`m4_case + b4_percent_define_get`. Same for b4_parse_error_bmatch.

* data/skeletons/glr.c: here
* data/skeletons/yacc.c: here
2020-02-27 18:13:43 +01:00
Adrian Vogelsgesang
368fcf0af5 typo: succesful -> successful
* data/skeletons/lalr1.cc: here
* etc/bench.pl.in: here
* src/location.c: here
* tests/calc.at: and here
2020-02-27 18:10:39 +01:00
Akim Demaille
1e25cb44d1 java: provide a Context ctor
This is really a private auxiliary inner class, so it should not
matter.  But it's better style.

* data/skeletons/lalr1.java: here.
2020-02-13 19:22:10 +01:00
Akim Demaille
9ed802a026 java: avoid trailing white spaces
* data/skeletons/java.m4 (b4_maybe_throws): Issue a space before when needed.
* data/skeletons/lalr1.java: Avoid trailing spaces.
2020-02-13 08:18:12 +01:00
Akim Demaille
fb554c2804 m4: fix b4_token_format
We used to emit:

    /** Token number,to be returned by the scanner.  */
    static final int NUM = 258;
    /** Token number,to be returned by the scanner.  */
    static final int NEG = 259;

with no space after the comma.  Fix that.

* data/skeletons/bison.m4 (b4_token_format): Quote where appropriate.
2020-02-13 08:18:11 +01:00
Akim Demaille
11e2b755f0 c++: simplify
* data/skeletons/stack.hh (ssize): Remove, same as size.
2020-02-12 00:09:15 +01:00
Akim Demaille
126252333d java: don't expose the Context's members
* data/skeletons/lalr1.java (Context): Make data members private.
(Context.getLocation): New.
* examples/java/calc/Calc.y, tests/java.at, tests/local.at: Adjust.
2020-02-11 08:39:03 +01:00
Akim Demaille
77bdcc6f0c parse.error: document and diagnose the incompatibility with %token-table
* doc/bison.texi (Tokens from Literals): Move to code using
%token-table to...
(Decl Summary: %token-table): here.
* data/skeletons/bison.m4: Implement mutual exclusion.
* tests/input.at: Check it.
* doc/local.mk: Be robust to the removal of doc/.
2020-02-10 20:15:46 +01:00
Akim Demaille
ed1eedbbcb java: revert "style: avoid useless initializers"
This reverts commit ebab1ffca8.
This commit removed "useless" initializers, going from

    /* YYPACT[STATE-NUM] -- Index in YYTABLE of the portion describing
       STATE-NUM.  */
    private static final byte yypact_[] = yypact_init ();
    private static final byte[] yypact_init ()
    {
      return new byte[]
      {
        25,    -7,    -8,    37,    -8,    40,    -8,    20,    -8,    61,
        -8,    -8,     3,     9,    51,    -8,    -8,    -2,    -2,    -2,
        -2,    -2,    -2,    -8,    -8,    -8,     1,    66,    66,     3,
         3,     3
      };
    }

to

    /* YYPACT[STATE-NUM] -- Index in YYTABLE of the portion describing
       STATE-NUM.  */
    private static final byte[] yypact_ =
    {
        25,    -7,    -8,    37,    -8,    40,    -8,    20,    -8,    61,
        -8,    -8,     3,     9,    51,    -8,    -8,    -2,    -2,    -2,
        -2,    -2,    -2,    -8,    -8,    -8,     1,    66,    66,     3,
         3,     3
    };

But it turns out that this was on purpose, to work around the 64KB
limitation in JVM methods.  It was introduced on the 2008-11-10 by
Di-an Jan in 09ccae9b18: "Work around
Java's ``code too large'' problem for parser tables".  See
https://lists.gnu.org/r/help-bison/2008-11/msg00004.html.  A real
test, where we would hit the JVM limitation, would be nice.

To avoid further regressions, add comments.
2020-02-10 07:34:07 +01:00
Akim Demaille
bc74b4b15a skeletons: avoid b4_error_verbose_if, which is confusing
parse.error has more than two possible values.

* data/skeletons/bison.m4 (b4_error_verbose_if, b4_error_verbose_flag):
Remove.
(b4_parse_error_case, b4_parse_error_bmatch): New.
Adjust dependencies.
2020-02-10 07:24:38 +01:00
Akim Demaille
8dd8137c38 skeletons: decorelate %token-table from verbose error messages
Reported by Adrian Vogelsgesang.

* data/skeletons/bison.m4: Here.
* data/skeletons/lalr1.cc: Adjust.
2020-02-10 07:24:38 +01:00
Adrian Vogelsgesang
bb02acc2ed style: stylistic cleanups in the C skeletons
* data/skeletons/glr.c, data/skeletons/yacc.c:
Avoid duplicated declaration of yysymbol_name.
2020-02-09 15:58:27 +01:00
Akim Demaille
80a4389377 java: provide Context with a more OO interface
* data/skeletons/lalr1.java (yyexpectedTokens)
(yysyntaxErrorArguments): Make them methods of Context.
(Context.yysymbolName): New.
* tests/local.at: Adjust.
2020-02-08 16:17:53 +01:00
Akim Demaille
ef097719ea java: add support for parse.error custom
* data/skeletons/lalr1.java: Add support for custom parse errors.
(yyntokens_): Make it public.  Under...
(yyntokens): this name.
(Context): Capture the location too.
* examples/c/bistromathic/parse.y,
* examples/c/bistromathic/bistromathic.test:
Improve error message.
* examples/java/calc/Calc.test, examples/java/calc/Calc.y: Use custom
error messages.
* tests/calc.at, tests/local.at: Check custom error messages.
2020-02-08 16:03:50 +01:00
Akim Demaille
0c90c59795 java: let the Context give access to yyntokens
* data/skeletons/lalr1.java (Context.yytokens): New.
2020-02-08 15:54:16 +01:00
Akim Demaille
18a7cfc7cf java: make the syntax error format string translatable
The error format should be translated, but contrary to the case of
C/C++, we cannot just depend on macros to adapt on the
presence/absence of '_'.  Let's consider that the message format is to
be translated iff there are some internationalized tokens.

* src/output.c (prepare_symbol_names): Define b4_has_translations.
* data/skeletons/java.m4 (b4_trans): New.
* data/skeletons/lalr1.java: Use it to emit translatable or not the
format string.
2020-02-08 11:24:53 +01:00
Akim Demaille
088356cb2f java: introduce yyexpectedTokens
And allow syntax error messages for 'detailed' and 'verbose' to be
translated.

* data/skeletons/lalr1.java (Context, yyexpectedTokens)
(yysyntaxErrorArguments): New.
(yysyntax_error): Use it.
2020-02-08 11:24:53 +01:00
Akim Demaille
52db24b2bc java: add support for parse.error=detailed
In Java there is no need for N_ and yytranslate_.  So instead of
hard-coding the use of N_ in the table of the symbol names, rely on
b4_symbol_translate.

* src/output.c (prepare_symbol_names): Use b4_symbol_translate instead
of N_.
* data/skeletons/c.m4 (b4_symbol_translate): New.
* data/skeletons/lalr1.java (yysymbolName): New.
Use it.
* examples/java/calc/Calc.y: Use parse.error=detailed.
* tests/calc.at: Check parse.error=detailed.
2020-02-08 11:24:53 +01:00
Akim Demaille
650b253843 m4: fix b4_token_format
We used to emit:

    /** Token number,to be returned by the scanner.  */
    static final int NUM = 258;
    /** Token number,to be returned by the scanner.  */
    static final int NEG = 259;

with no space after the comma.  Fix that.

* data/skeletons/bison.m4 (b4_token_format): Quote where appropriate.
2020-02-08 11:24:53 +01:00
Akim Demaille
ebab1ffca8 java: style: avoid useless initializers
Instead of

      /* YYPACT[STATE-NUM] -- Index in YYTABLE of the portion describing
       STATE-NUM.  */
      private static final byte yypact_[] = yypact_init ();
      private static final byte[] yypact_init ()
      {
        return new byte[]
        {
          25,    -7,    -8,    37,    -8,    40,    -8,    20,    -8,    61,
          -8,    -8,     3,     9,    51,    -8,    -8,    -2,    -2,    -2,
          -2,    -2,    -2,    -8,    -8,    -8,     1,    66,    66,     3,
           3,     3
        };
      }

generate

    /* YYPACT[STATE-NUM] -- Index in YYTABLE of the portion describing
       STATE-NUM.  */
      private static final byte[] yypact_ =
      {
          25,    -7,    -8,    37,    -8,    40,    -8,    20,    -8,    61,
          -8,    -8,     3,     9,    51,    -8,    -8,    -2,    -2,    -2,
          -2,    -2,    -2,    -8,    -8,    -8,     1,    66,    66,     3,
           3,     3
      };

I have no idea what motivated the previous approach.

* data/skeletons/java.m4 (b4_typed_parser_table_define): Here.
2020-02-05 13:17:00 +01:00
Akim Demaille
f705e9abdb java: style: prefer putting the square brackets on the type
* examples/java/calc/Calc.y, examples/java/simple/Calc.y,
* tests/calc.at, tests/local.at: here.
2020-02-05 13:17:00 +01:00
Akim Demaille
d727e0ff23 traces: don't print the stack before the gotos
The C, C++ and D skeletons used to show the stack right after popping
the stack during the reduction.  Now that the stack is printed after
reaching a new state, that has become useless:

    Entering state 1
    Stack now 0 1
    Reducing stack by rule 5 (line 83):
       $1 = token "number" (1)
    -> $$ = nterm exp (1)
    Stack now 0
    Entering state 8
    Stack now 0 8

Remove the "Stack now 0" line.

* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c:
Here.
2020-02-05 07:40:07 +01:00
Akim Demaille
37aeda6fb3 traces: show the stack after reading a token
Currently, if we have long rules and series of shift, we stack states
without showing stack.  Let's be more incremental, and do how the Java
skeleton does.

* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c:
Here.
Adjust test cases.
* tests/torture.at (AT_DATA_STACK_TORTURE): Disable stack traces: this
test produces a very large stack, and showing the stack each time we
shift a token goes quadatric.
2020-02-05 06:48:42 +01:00
Akim Demaille
bba2f0a3a0 traces: write the "Reading a token" alone on its line
The Java skeleton displays

    Reading a token:
    Next token is token "number" (1)

while the other display

    Reading a token: Next token is token "number" (1)

When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in.  So let's propagate the Java way, but with the colon.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
2020-02-04 07:02:24 +01:00
Akim Demaille
fe14fb1c40 java: use the same calc tests as the other skeletons
* tests/local.at (AT_LANG_MATCH): New.
(AT_YYERROR_DECLARE(java), AT_YYERROR_DECLARE_EXTERN(java)): New.
* tests/calc.at: The grammar file for Java is quite different for the
others, and continuing to assemble it from pieces makes the grammar
file hard to understand.  Let's also dispatch on the language to
assemble it, and isolate Java from the others.
Most of this comes from java.at.
2020-02-02 11:33:16 +01:00
Akim Demaille
88d1ab421a java: add access to the number of errors
* data/skeletons/lalr1.java (yynewrrs, getNumberOfErrors): New.
Formatting changes.
2020-02-02 11:33:16 +01:00
Akim Demaille
edf495b38e java: formatting changes
* data/skeletons/java.m4, data/skeletons/lalr1.java: here.
2020-02-02 11:33:16 +01:00
Akim Demaille
ba69beafac java: avoid trailing white spaces
* data/skeletons/java.m4 (b4_maybe_throws): Issue a space before when needed.
* data/skeletons/lalr1.java: Avoid trailing spaces.
2020-02-02 11:33:16 +01:00
Akim Demaille
0774b2c6e3 skeletons: add support for %code epilogue
When building the test cases, emitting code in the epilogue is very
constraining.  Let's make it simpler thanks to %code epilogue.

However, I don't want to document this: it is bad style to use it (we
should avoid having too many ways to write the same thing,
TI!MTOWTDI), just put your code in the true epilogue section.

* data/skeletons/glr.c, data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c: Implement support for %code epilogue.
Remove useless comments.
* tests/calc.at, tests/java.at: Simplify.
2020-02-02 11:28:45 +01:00
Akim Demaille
792fc34016 glr.c: add support for parse.error=custom
* data/skeletons/glr.c (yyreportSyntaxError): Call the user's
yyreport_syntax_error in custom mode.
* tests/calc.at: Check it.
2020-01-29 19:48:16 +01:00
Akim Demaille
c4a08d1899 glr.c: add support for parse.error=detailed
* data/skeletons/glr.c (yystrlen, yysymbol_name): New.
Implement parse.error detailed.
* tests/calc.at: Check it.
2020-01-29 19:48:12 +01:00
Akim Demaille
50fb1e6e0c glr.c: introduce yyexpected_tokens and yysyntax_error_arguments
Modeled after what was done in yacc.c, yet somewhat different since
yyGLRStack play the role of the full yyparse_context_t.  It will
probably be defined as a alias, to make interfaces compatible.

LAC is not supported here.  And is somewhat tricky.

* data/skeletons/glr.c (yyexpected_tokens, yysyntax_error_arguments): New.
2020-01-29 19:27:42 +01:00
Akim Demaille
6bd0527a50 glr.c: move code around
* data/skeletons/glr.c: Move the handling of error messages after the
definition of types.  Especially after yyGLRStack, which we will need
in forthcoming patches.
Simplify: yyGLRStack is defined at this point.
2020-01-29 19:27:24 +01:00
Akim Demaille
1aa9d40b24 glr.c: rename yyStateNum as yy_state_t
* data/skeletons/glr.c: here.
For consistency with yacc.c.
2020-01-29 07:15:19 +01:00
Akim Demaille
a1da9b9fe8 yacc.c: fix misleading indentation
* data/skeletons/yacc.c: here.
2020-01-29 07:15:19 +01:00
Akim Demaille
0f792833c2 style: formatting changes
* data/skeletons/lalr1.java: here.
2020-01-26 13:29:19 +01:00
Akim Demaille
46ab1d0cbe diagnostics: report syntax errors in color
* src/parse-gram.y (parse.error): Set to 'custom'.
(yyreport_syntax_error): New.
* data/bison-default.css (.expected, .unexpected): New.
* tests/diagnostics.at: Adjust.
2020-01-23 08:26:33 +01:00
Akim Demaille
16c77b4ba2 yacc.c: fixes
* data/skeletons/yacc.c: Avoid warnings about unused functions.
Fix typo.
2020-01-22 22:31:07 +01:00
Adrian Vogelsgesang
4ab2cf7450 larlr1.cc: Reject unsupported values for parse.lac
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.

* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
2020-01-21 06:57:21 +01:00
Adrian Vogelsgesang
172f103c1e larlr1.cc: Reject unsupported values for parse.lac
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.

* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
2020-01-21 06:22:27 +01:00
Akim Demaille
9096955fba parsers: support translatable token aliases
In addition to

    %token NUM "number"

accept

    %token NUM _("number")

in which case the token will be translated in error messages.
Do not use _() in the output if there are no translatable tokens.

* src/symtab.h, src/symtab.c (symbol): Add a 'translatable' member.
* src/parse-gram.y (TSTRING): New token.
(string_as_id.opt): Replace with...
(alias): this.
Use it.
* src/scan-gram.l (SC_ESCAPED_TSTRING): New start conditions, to match
TSTRINGs.
* src/output.c (prepare_symbols): Define b4_translatable if there are
translatable strings.

* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c (yytnamerr): Receive b4_translatable, and use it.
2020-01-19 21:23:11 +01:00
Akim Demaille
f443673450 yacc.c: add support for parse.error detailed
"detailed" error messages are almost like "verbose", except that we
don't double escape them, they don't get inner quotes, we don't use
yytnamerr, and we hide the table.

"custom" is exposed with the "detailed" tokens, not the "verbose"
ones: they are not double-quoted.

Because there's a risk that some people use yytname even without
"verbose", let's keep yytname (instead of yys_name) in "simple"
parse.error.

* src/output.c (prepare_symbol_names): Be ready to output symbol names
unquoted.
(prepare_symbol_names): Output both the old tname table, and the new
symbol_names one.
* data/skeletons/bison.m4: Accept 'detailed'.
* data/skeletons/yacc.c: When parse.error is 'detailed', don't emit
yytname and yytnamerr, just yysymbol_name with the table inside.
* tests/calc.at: Adjust.
2020-01-19 14:51:14 +01:00