Commit Graph

1737 Commits

Author SHA1 Message Date
Akim Demaille
cfcd823e16 diagnostics: don't crash because of repeated definitions of error
According to https://www.unix.com/man-page/POSIX/1posix/yacc/, the
user is allowed to specify her user number for the error token:

    The token error shall be reserved for error handling. The name
    error can be used in grammar rules. It indicates places where the
    parser can recover from a syntax error. The default value of error
    shall be 256. Its value can be changed using a %token
    declaration. The lexical analyzer should not return the value of
    error.

I think this feature is useless: the user should not have to deal with
that.  The intent is probably to give the user a means to use 256 if
she wants to, but provided "error" cleared the path first by being
assigned another number.  In the case of Bison, 256 is assigned to
"error" at the end if the user did not use it for a token of hers.  So
this feature is useless.

Yet it is valid, and if the user assigns a token number to "error"
twice, then the second time we want to complain about it and want to
show the original definition.  At this point, we try to display the
built-in definition of "error", whose location is NULL, and we crash.

Rather, the location of the first user definition of "error" should
become its defining location.

Reported by Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00007.html

* src/symtab.c (symbol_class_set): If this is a declaration and the
symbol was not declared yet, keep this as defining location.
* tests/input.at (Redefining the error token): New.
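
The fix can be pictured with a minimal C sketch. This is not Bison's
actual symbol_class_set: the `symbol` struct, its fields, and the
`declare_symbol` helper are hypothetical simplifications, and locations
are reduced to plain strings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified symbol record; Bison's real one is richer.  */
typedef struct
{
  const char *name;
  const char *location;  /* NULL for a built-in symbol such as "error".  */
  bool declared;         /* Set once the user declares the symbol.  */
} symbol;

/* Record a user declaration of SYM at LOC.  Return the location of the
   previous user declaration, or NULL if this is the first one.  The
   first declaration becomes the defining location, so a later
   "previous definition" note never has to display a NULL location.  */
const char *
declare_symbol (symbol *sym, const char *loc)
{
  if (!sym->declared)
    {
      sym->declared = true;
      sym->location = loc;
      return NULL;
    }
  return sym->location;
}
```

With this shape, a second declaration of "error" gets back the location
of the first user declaration instead of the built-in one.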
2020-03-08 08:10:11 +01:00
Akim Demaille
b638603477 diagnostics: beware of zero-width characters
Currently we rely on the (visual) width of the characters to decide where
to open and close the styling of the quoted lines.  This breaks when
we deal with zero-width characters: we cannot just rely on (visual)
columns, we need to know whether we are before, inside, or after the
highlighted portion.

* src/location.c (location_caret): col_end: no longer add 1, "regular"
characters have a width of 1, only 0-width characters have 0-width.
opened: replace with 'state', a three-valued enum.
Don't reopen the style if we already did.
* tests/diagnostics.at (Zero-width characters): New.
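
A minimal sketch of the three-valued state, under simplifying
assumptions: it works on byte offsets rather than visual columns, the
`highlight` and `append` helpers are hypothetical, and the style is
rendered as literal <error> tags.  It is not the code of
location_caret.

```c
#include <stddef.h>
#include <string.h>

/* Where we are relative to the highlighted portion.  */
typedef enum { BEFORE, INSIDE, AFTER } caret_state;

/* Append the N bytes of TEXT to the NUL-terminated OUT (capacity
   OUT_SIZE).  Return 0 on success, -1 if OUT is too small.  */
static int
append (char *out, size_t out_size, const char *text, size_t n)
{
  size_t used = strlen (out);
  if (out_size - used <= n)
    return -1;
  memcpy (out + used, text, n);
  out[used + n] = '\0';
  return 0;
}

/* Copy LINE into OUT, wrapping the bytes [BEGIN, END) in
   <error>...</error>.  The style is opened at most once, and it is
   always closed, even when LINE is shorter than END.  */
int
highlight (const char *line, size_t begin, size_t end,
           char *out, size_t out_size)
{
  caret_state state = BEFORE;
  size_t len = strlen (line);
  if (out_size == 0)
    return -1;
  out[0] = '\0';
  for (size_t i = 0; i < len; ++i)
    {
      if (state == BEFORE && begin <= i && i < end)
        {
          if (append (out, out_size, "<error>", 7) < 0)
            return -1;
          state = INSIDE;
        }
      else if (state == INSIDE && end <= i)
        {
          if (append (out, out_size, "</error>", 8) < 0)
            return -1;
          state = AFTER;
        }
      if (append (out, out_size, &line[i], 1) < 0)
        return -1;
    }
  /* The line ended while the style was still open: close it.  */
  if (state == INSIDE && append (out, out_size, "</error>", 8) < 0)
    return -1;
  return 0;
}
```

Because the decision depends on the state and not on a running column
count, a character that does not advance the column can neither reopen
nor skip the style.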
2020-03-08 08:10:11 +01:00
Akim Demaille
e21ff47f5d diagnostics: be sure to close the styling when lines are too short
bar.y:4.12-17: <error>error:</error> redefining user token number of foo
    -    4 | %token foo <error>123
    +    4 | %token foo <error>123</error>
           |            <error>^~~~~~</error>

* src/location.c (location_caret): Be sure to close.
* tests/diagnostics.at (Line is too short, and then you die): New.
2020-03-07 10:01:52 +01:00
Akim Demaille
b82b387da9 muscles: fix incorrect decoding of $
Bug introduced in 458171e6df.
https://lists.gnu.org/archive/html/bison-patches/2013-11/msg00009.html

Reported by Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00010.html

* src/muscle-tab.c (COMMON_DECODE): "$" is coded as "$][", not "$[][".
* tests/input.at ("%define" enum variables): Check that case.
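
A minimal sketch of the corrected decoding, limited to the "$" case
stated above.  The `decode_dollars` function is hypothetical; the real
COMMON_DECODE is a macro that also handles other escapes, which are
left out here.

```c
#include <stddef.h>
#include <string.h>

/* Decode SRC into DST (capacity DST_SIZE): each "$][" becomes "$",
   every other byte is copied verbatim.  Return 0 on success, -1 if
   DST is too small or if a '$' is not followed by "][".  */
int
decode_dollars (const char *src, char *dst, size_t dst_size)
{
  size_t n = 0;
  while (*src)
    {
      if (n + 1 >= dst_size)
        return -1;
      if (*src == '$')
        {
          /* "$" is coded as "$][", not "$[][".  */
          if (strncmp (src, "$][", 3) != 0)
            return -1;
          src += 3;
          dst[n++] = '$';
        }
      else
        dst[n++] = *src++;
    }
  if (n >= dst_size)
    return -1;
  dst[n] = '\0';
  return 0;
}
```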
2020-03-07 07:45:10 +01:00
Akim Demaille
641e326303 code: be robust to reference with invalid tags
Because we want to support $<a->b>$, we must accept -> in type tags,
and reject $<->$, as it is unfinished.
Reported by Ahcheong Lee.

* src/scan-code.l (yylex): Make sure "tag" does not end with -, since
-> does not close the tag.
* tests/input.at (Stray $ or @): Check this.
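
The rule can be sketched in C as follows.  This is a hand-written
illustration, not the Flex pattern used in scan-code.l, and
`tag_length` is a hypothetical helper.

```c
#include <stddef.h>

/* S points just after the '<' of a "$<tag>$" reference.  Return the
   length of the tag, or -1 if no '>' closes it.  A '>' preceded by
   '-' is the end of "->" and does not close the tag, so a tag can
   never end with '-'.  */
int
tag_length (const char *s)
{
  size_t i = 0;
  for (; s[i]; ++i)
    if (s[i] == '>' && (i == 0 || s[i - 1] != '-'))
      return (int) i;
  return -1;
}
```

With this rule the tag of $<a->b>$ is "a->b", while $<->$ has no
closing '>' and is rejected.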
2020-03-06 17:29:26 +01:00
Akim Demaille
4fd3282dd7 style: formatting changes
* data/skeletons/yacc.c, tests/torture.at: here.
2020-03-04 08:24:36 +01:00
Adrian Vogelsgesang
c2cca46795 c++: add support for parse.error=custom
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added test cases
* tests/local.at: added yyreport_syntax_error implementation
   for C++ test cases
2020-02-27 18:13:44 +01:00
Adrian Vogelsgesang
72acecb30c c++: add support for parse.error=detailed
* data/skeletons/lalr1.cc: added support here
* tests/calc.at: added a test case
2020-02-27 18:13:43 +01:00
Adrian Vogelsgesang
368fcf0af5 typo: succesful -> successful
* data/skeletons/lalr1.cc: here
* etc/bench.pl.in: here
* src/location.c: here
* tests/calc.at: and here
2020-02-27 18:10:39 +01:00
Victor Morales Cayuela
e09a72eeb0 diagnostics: modernize the display of submessages
Since Bison 2.7, output was indented four spaces for explanatory
statements.  For example:

    input.y:2.7-13: error: %type redeclaration for exp
    input.y:1.7-11:     previous declaration

Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:

    input.y:2.7-13: error: %type redeclaration for exp
        2 | %type <float> exp
          |       ^~~~~~~
    input.y:1.7-11: note: previous declaration
        1 | %type <int> exp
          |       ^~~~~

* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
2020-02-15 08:28:40 +01:00
Akim Demaille
f3d33c3613 tests: check calls to yyerror from the user actions
This revealed a number of things I had not realized:

- the Java location tracking was aliasing the same pair of positions
  for all the symbols (see previous commit).

- in impure parsers, it's quite easy to use incorrect locations for
  diagnostics, since yyerror uses yylloc, which is the location of the
  lookahead, not that of the current lhs.  So we need something like

    {
      YYLTYPE old_yylloc = yylloc;
      yylloc = @$;
      yyerror (]AT_PARAM_IF([result, count, nerrs, ])[buf);
      yylloc = old_yylloc;
    }

  Maybe we should do that little yylloc dance in the skeleton instead
  of leaving it to the user?  It might be costly...  But that's only
  for users of the impure parsers, which are asking for trouble
  anyway.

- in glr.cc invoking yyerror is somewhat cumbersome: the C++ interface
  is not available as we are in yyparse (which is in C), and yyerror is
  used by glr.cc itself to bind it to the user's parser::error.  If we
  call yyerror, we need:

    yyerror (]AT_LOCATION_IF([[&@$, ]])[yyparser, ]AT_PARAM_IF([result, count, nerrs, ])[msg);

  However calling yy::parser::error is easier, once we know that the
  current parser object is available as 'yyparser'.  Which also saves
  us from having to pass the parse-params ourselves:

    yyparser.error (]AT_LOCATION_IF([[@$, ]])[msg);

* tests/calc.at: Invoke yyerror by hand, instead of using fprintf etc.
Adjust expectations.
2020-02-12 00:00:05 +01:00
Akim Demaille
163a35d6dd java: beware not to alias the locations of the various symbols
* examples/java/calc/Calc.y, tests/calc.at, tests/local.at
(getStartPos, getEndPos): Always return a new object.
* doc/bison.texi: Clarify this.
2020-02-11 21:31:44 +01:00
Akim Demaille
cdb42f7730 java: check that parse.error custom|detailed work with push parsers
* tests/calc.at: here.
2020-02-11 08:39:08 +01:00
Akim Demaille
126252333d java: don't expose the Context's members
* data/skeletons/lalr1.java (Context): Make data members private.
(Context.getLocation): New.
* examples/java/calc/Calc.y, tests/java.at, tests/local.at: Adjust.
2020-02-11 08:39:03 +01:00
Akim Demaille
77bdcc6f0c parse.error: document and diagnose the incompatibility with %token-table
* doc/bison.texi (Tokens from Literals): Move the code using
%token-table to...
(Decl Summary: %token-table): here.
* data/skeletons/bison.m4: Implement mutual exclusion.
* tests/input.at: Check it.
* doc/local.mk: Be robust to the removal of doc/.
2020-02-10 20:15:46 +01:00
Akim Demaille
6f5465c917 doc: clearly state that %yacc only makes sense with yacc.c
* doc/bison.texi: here.
* tests/calc.at: Stop testing %yacc with non yacc.c skeletons.
2020-02-09 15:58:55 +01:00
Akim Demaille
80a4389377 java: provide Context with a more OO interface
* data/skeletons/lalr1.java (yyexpectedTokens)
(yysyntaxErrorArguments): Make them methods of Context.
(Context.yysymbolName): New.
* tests/local.at: Adjust.
2020-02-08 16:17:53 +01:00
Akim Demaille
ef097719ea java: add support for parse.error custom
* data/skeletons/lalr1.java: Add support for custom parse errors.
(yyntokens_): Make it public.  Under...
(yyntokens): this name.
(Context): Capture the location too.
* examples/c/bistromathic/parse.y,
* examples/c/bistromathic/bistromathic.test:
Improve error message.
* examples/java/calc/Calc.test, examples/java/calc/Calc.y: Use custom
error messages.
* tests/calc.at, tests/local.at: Check custom error messages.
2020-02-08 16:03:50 +01:00
Akim Demaille
52db24b2bc java: add support for parse.error=detailed
In Java there is no need for N_ and yytranslate_.  So instead of
hard-coding the use of N_ in the table of the symbol names, rely on
b4_symbol_translate.

* src/output.c (prepare_symbol_names): Use b4_symbol_translate instead
of N_.
* data/skeletons/c.m4 (b4_symbol_translate): New.
* data/skeletons/lalr1.java (yysymbolName): New.
Use it.
* examples/java/calc/Calc.y: Use parse.error=detailed.
* tests/calc.at: Check parse.error=detailed.
2020-02-08 11:24:53 +01:00
Akim Demaille
7781254e01 java: tests: remove now redundant tests
* tests/javapush.at: here.
2020-02-05 13:17:00 +01:00
Akim Demaille
fa226d773c java: tests: check push parsers like the others
Currently in javapush.at.

* tests/calc.at: Here.
2020-02-05 13:17:00 +01:00
Akim Demaille
1cc83505c5 java: tests: remove now redundant tests
* tests/java.at: Calculator tests are now in calc.at.
2020-02-05 13:17:00 +01:00
Akim Demaille
2d97fe86fd java: tests: check location tracking in the calculator
Unfortunately in the Java skeleton the user cannot override the way
locations are displayed, and locations don't know the structure of the
positions.  So they cannot implement the tricks used in the C/C++
skeletons to display "1.1" instead of "1.1-1.2".

* tests/local.at (Java): Add support for column tracking in the
locations, as we did in examples/java/calc.
* tests/calc.at: Use AT_CALC_YYLEX.
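
For reference, the trick used on the C side can be sketched like this.
The `location` struct and `format_location` are simplified,
hypothetical stand-ins for the skeleton's location type and printing
macro; the sketch assumes, as in the skeletons, that the end position
is exclusive.

```c
#include <stdio.h>

typedef struct
{
  int first_line, first_column;
  int last_line, last_column;   /* Exclusive end position.  */
} location;

/* Format LOC into BUF (capacity SIZE) and return snprintf's result.
   The last column really covered is last_column - 1, so a one-column
   location such as 1.1-1.2 is shown as just "1.1".  */
int
format_location (char *buf, size_t size, location loc)
{
  int end_col = loc.last_column != 0 ? loc.last_column - 1 : 0;
  if (loc.first_line < loc.last_line)
    return snprintf (buf, size, "%d.%d-%d.%d",
                     loc.first_line, loc.first_column,
                     loc.last_line, end_col);
  if (loc.first_column < end_col)
    return snprintf (buf, size, "%d.%d-%d",
                     loc.first_line, loc.first_column, end_col);
  return snprintf (buf, size, "%d.%d", loc.first_line, loc.first_column);
}
```

This works only because the printer knows that a position is a
(line, column) pair, which is exactly what the Java locations do not
know.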
2020-02-05 13:17:00 +01:00
Akim Demaille
3239866f4a java: tests: prepare the replacement of calculator tests
Soon calculator tests for Java will move from java.at to calc.at.
Which implies improving the Java testing infrastructure in
local.at (for instance really tracking columns in positions, not just
token number).  Detach java.at from local.at.

* tests/java.at (AT_JAVA_POSITION_DEFINE_OLD): New.
Use it.
2020-02-05 13:17:00 +01:00
Akim Demaille
f705e9abdb java: style: prefer putting the square brackets on the type
* examples/java/calc/Calc.y, examples/java/simple/Calc.y,
* tests/calc.at, tests/local.at: here.
2020-02-05 13:17:00 +01:00
Akim Demaille
d727e0ff23 traces: don't print the stack before the gotos
The C, C++ and D skeletons used to show the stack right after popping
the stack during the reduction.  Now that the stack is printed after
reaching a new state, that has become useless:

    Entering state 1
    Stack now 0 1
    Reducing stack by rule 5 (line 83):
       $1 = token "number" (1)
    -> $$ = nterm exp (1)
    Stack now 0
    Entering state 8
    Stack now 0 8

Remove the "Stack now 0" line.

* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c:
Here.
2020-02-05 07:40:07 +01:00
Akim Demaille
37aeda6fb3 traces: show the stack after reading a token
Currently, if we have long rules and series of shifts, we stack states
without showing the stack.  Let's be more incremental, and do as the
Java skeleton does.

* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c:
Here.
Adjust test cases.
* tests/torture.at (AT_DATA_STACK_TORTURE): Disable stack traces: this
test produces a very large stack, and showing the stack each time we
shift a token goes quadratic.
2020-02-05 06:48:42 +01:00
Akim Demaille
bba2f0a3a0 traces: write the "Reading a token" alone on its line
The Java skeleton displays

    Reading a token:
    Next token is token "number" (1)

while the others display

    Reading a token: Next token is token "number" (1)

When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in.  So let's propagate the Java way, but with the colon.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
2020-02-04 07:02:24 +01:00
Akim Demaille
fe14fb1c40 java: use the same calc tests as the other skeletons
* tests/local.at (AT_LANG_MATCH): New.
(AT_YYERROR_DECLARE(java), AT_YYERROR_DECLARE_EXTERN(java)): New.
* tests/calc.at: The grammar file for Java is quite different from the
others, and continuing to assemble it from pieces makes the grammar
file hard to understand.  Let's also dispatch on the language to
assemble it, and isolate Java from the others.
Most of this comes from java.at.
2020-02-02 11:33:16 +01:00
Akim Demaille
edf495b38e java: formatting changes
* data/skeletons/java.m4, data/skeletons/lalr1.java: here.
2020-02-02 11:33:16 +01:00
Akim Demaille
d5f929d407 java: example: rely on autoboxing
AFAICT, autoboxing/unboxing was added in Java 5 (September 30, 2004).
I think we can afford to use it.  It should help us merge some Java
tests with the main ones.

However, beware that != does not unbox: it compares the object
addresses.

* examples/java/Calc.y, tests/java.at: Simplify.
* examples/java/Calc.test, tests/java.at: Improve tests.
2020-02-02 11:32:55 +01:00
Akim Demaille
c5b215b5e6 tests: comment changes
* tests/calc.at: Shorten titles and reduce redundancy.
2020-02-02 11:28:45 +01:00
Akim Demaille
0774b2c6e3 skeletons: add support for %code epilogue
When building the test cases, emitting code in the epilogue is very
constraining.  Let's make it simpler thanks to %code epilogue.

However, I don't want to document this: it is bad style to use it (we
should avoid having too many ways to write the same thing,
TI!MTOWTDI), just put your code in the true epilogue section.

* data/skeletons/glr.c, data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c: Implement support for %code epilogue.
Remove useless comments.
* tests/calc.at, tests/java.at: Simplify.
2020-02-02 11:28:45 +01:00
Akim Demaille
792fc34016 glr.c: add support for parse.error=custom
* data/skeletons/glr.c (yyreportSyntaxError): Call the user's
yyreport_syntax_error in custom mode.
* tests/calc.at: Check it.
2020-01-29 19:48:16 +01:00
Akim Demaille
c4a08d1899 glr.c: add support for parse.error=detailed
* data/skeletons/glr.c (yystrlen, yysymbol_name): New.
Implement parse.error detailed.
* tests/calc.at: Check it.
2020-01-29 19:48:12 +01:00
Akim Demaille
0917f4dc76 tests: check custom error messages and push parsers
* tests/local.at (AT_LAC_IF): New.
* tests/calc.at: And also check the support for LAC.
2020-01-26 13:29:19 +01:00
Akim Demaille
fc2191f137 diagnostics: modernize bison's syntax errors
We used to display the unexpected token first:

    $ bison foo.y
    foo.y:1.8-13: error: syntax error, unexpected %token, expecting character literal or identifier or <tag>
        1 | %token %token
          |        ^~~~~~

GCC uses a different format:

    $ gcc-mp-9 foo.c
    foo.c:1:5: error: expected identifier or '(' before ')' token
        1 | int()()()
          |     ^

and so does Clang:

    $ clang-mp-9.0 foo.c
    foo.c:1:5: error: expected identifier or '('
    int()()()
        ^
    1 error generated.

They display the unexpected token last (or not at all).  Also, they
don't waste width with "syntax error".  Let's try that.  It gives, for
the same example as above:

    $ bison foo.y
    foo.y:1.8-13: error: expected character literal or identifier or <tag> before %token
        1 | %token %token
          |        ^~~~~~

* src/complain.h, src/complain.c (syntax_error): New.
* src/parse-gram.y (yyreport_syntax_error): Use it.
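
A minimal sketch of how the new message is assembled.  The
`format_syntax_error` and `append` helpers are hypothetical and are not
the syntax_error function added to src/complain.c; the wording used
when no token is expected is this sketch's own choice.

```c
#include <stddef.h>
#include <string.h>

/* Append TEXT to the NUL-terminated BUF (capacity SIZE).  Return 0 on
   success, -1 if BUF is too small.  */
static int
append (char *buf, size_t size, const char *text)
{
  size_t used = strlen (buf);
  size_t len = strlen (text);
  if (size - used <= len)
    return -1;
  memcpy (buf + used, text, len + 1);
  return 0;
}

/* Write "expected A or B or C before UNEXPECTED" into BUF: the
   expected tokens come first, the unexpected one last, and there is
   no "syntax error" prefix.  Return 0 on success, -1 if BUF is too
   small.  */
int
format_syntax_error (char *buf, size_t size, const char *unexpected,
                     const char *const *expected, int n_expected)
{
  if (size == 0)
    return -1;
  buf[0] = '\0';
  for (int i = 0; i < n_expected; ++i)
    if (append (buf, size, i == 0 ? "expected " : " or ") < 0
        || append (buf, size, expected[i]) < 0)
      return -1;
  /* Assumption: with no expected token, only name the unexpected one.  */
  if (append (buf, size, n_expected ? " before " : "unexpected ") < 0
      || append (buf, size, unexpected) < 0)
    return -1;
  return 0;
}
```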
2020-01-23 08:30:28 +01:00
Akim Demaille
46ab1d0cbe diagnostics: report syntax errors in color
* src/parse-gram.y (parse.error): Set to 'custom'.
(yyreport_syntax_error): New.
* data/bison-default.css (.expected, .unexpected): New.
* tests/diagnostics.at: Adjust.
2020-01-23 08:26:33 +01:00
Akim Demaille
2cc361387c diagnostics: translate bison's own tokens
As a test case, support translations in Bison itself.

* src/parse-gram.y: Mark the translatable tokens.
While at it, use clearer names.
* tests/input.at: Adjust expectations.
2020-01-23 08:26:28 +01:00
Adrian Vogelsgesang
4ab2cf7450 lalr1.cc: Reject unsupported values for parse.lac
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.

* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
2020-01-21 06:57:21 +01:00
Adrian Vogelsgesang
172f103c1e lalr1.cc: Reject unsupported values for parse.lac
Just as the yacc.c skeleton, the lalr1.cc skeleton should reject
invalid values for parse.lac.

* data/skeletons/lalr1.cc: check validity of parse.lac
* tests/input.at: new test cases
2020-01-21 06:22:27 +01:00
Akim Demaille
6ada985ff3 parsers: issue tname with i18n markup
Some users would like to avoid having to "parse" the *.y file to find
the strings to translate.  Let's issue the translatable tokens with N_
to allow "parsing" the generated parsers instead.

See
https://lists.gnu.org/archive/html/bison-patches/2019-01/msg00015.html

* src/output.c (prepare_symbol_names): Issue symbol_names with N_()
markup.
2020-01-19 21:23:11 +01:00
Akim Demaille
2e12257803 tests: check token internationalization
* tests/calc.at: Check it.
2020-01-19 21:23:11 +01:00
Akim Demaille
e9d404415a tests: check that detailed error messages preserve UTF-8 characters
* tests/regression.at: here.
2020-01-19 21:23:03 +01:00
Akim Demaille
d9df62bfcd yacc.c: escape trigraphs in detailed parse.error
* src/output.c (escape_trigraphs, xescape_trigraphs): New.
(prepare_symbol_names): Use it.
* tests/regression.at: Check the handling of trigraphs with
parse.error = detailed.
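
One way to neutralize trigraphs, sketched in C.  This is an assumption
about the approach, not the code added to src/output.c, and this
`escape_trigraphs` has its own hypothetical signature: it breaks every
"??" pair with a backslash, relying on "\?" denoting a plain question
mark in a C string literal.

```c
#include <stddef.h>

/* Copy SRC to DST (capacity DST_SIZE), inserting a backslash between
   two consecutive question marks, so that no "??" sequence, hence no
   trigraph, is left in the emitted string literal.  Return 0 on
   success, -1 if DST is too small.  */
int
escape_trigraphs (const char *src, char *dst, size_t dst_size)
{
  size_t n = 0;
  for (size_t i = 0; src[i]; ++i)
    {
      int need_escape = src[i] == '?' && 0 < i && src[i - 1] == '?';
      /* Room for the byte(s) to add, plus the final NUL.  */
      if (n + (need_escape ? 2 : 1) >= dst_size)
        return -1;
      if (need_escape)
        dst[n++] = '\\';
      dst[n++] = src[i];
    }
  if (n >= dst_size)
    return -1;
  dst[n] = '\0';
  return 0;
}
```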
2020-01-19 21:22:41 +01:00
Akim Demaille
91247f50d7 yacc.c: tests: check detailed error messages
* tests/local.at (AT_ERROR_DETAILED_IF): New.
(AT_ERROR_SIMPLE_IF): Adjust.
* tests/calc.at: Check parse.error=detailed.
2020-01-19 14:51:14 +01:00
Akim Demaille
f443673450 yacc.c: add support for parse.error detailed
"detailed" error messages are almost like "verbose", except that we
don't double escape them, they don't get inner quotes, we don't use
yytnamerr, and we hide the table.

"custom" is exposed with the "detailed" tokens, not the "verbose"
ones: they are not double-quoted.

Because there's a risk that some people use yytname even without
"verbose", let's keep yytname (instead of yys_name) in "simple"
parse.error.

* src/output.c (prepare_symbol_names): Be ready to output symbol names
unquoted.
(prepare_symbol_names): Output both the old tname table, and the new
symbol_names one.
* data/skeletons/bison.m4: Accept 'detailed'.
* data/skeletons/yacc.c: When parse.error is 'detailed', don't emit
yytname and yytnamerr, just yysymbol_name with the table inside.
* tests/calc.at: Adjust.
2020-01-19 14:51:14 +01:00
Akim Demaille
ebe427bbf3 Merge branch 'maint'
* maint:
  maint: post-release administrivia
  version 3.5.1
  news: update
  CI: use ICC again
  warnings: pacify ICC in lalr1.cc
  test: report.at: avoid tiny new failure
  git: update ignores
2020-01-19 14:50:09 +01:00
Jim Meyering
27e822abfd test: report.at: avoid tiny new failure
Be robust to newer versions of Autoconf where the package URL defaults
to https instead of http.

* configure.ac (AC_INIT): Use https.
* tests/report.at: Adjust expected output s/http/https/
to match updated URL.
2020-01-19 10:03:01 +01:00
Akim Demaille
e1197fcc3d yacc.c: portability to G++ 4.8
Currently we get warnings with GCC 4.8 when running the
maintainer-check-g++ tests:

    143. skeletons.at:85: testing Installed skeleton file names ...
    ../../tests/skeletons.at:120: COLUMNS=1000; export COLUMNS;  bison --color=no -fno-caret --skeleton=yacc.c -o input-cmd-line.c input-cmd-line.y
    ../../tests/skeletons.at:121: $CC $CFLAGS $CPPFLAGS  $LDFLAGS -o input-cmd-line input-cmd-line.c $LIBS
    stderr:
    input-cmd-line.c: In function 'int yysyntax_error(long int*, char**, const yyparse_context_t*)':
    input-cmd-line.c:977:52: error: conversion to 'int' from 'long int' may alter its value [-Werror=conversion]
                                       YYSIZEOF (yyarg) / YYSIZEOF (*yyarg));
                                                        ^
    cc1plus: all warnings being treated as errors
    stdout:
    ../../tests/skeletons.at:121: exit code was 1, expected 0

and

    429. calc.at:823: testing Calculator parse.error=custom %locations api.prefix={calc}  ...
    ../../tests/calc.at:823: COLUMNS=1000; export COLUMNS;  bison --color=no -fno-caret -Wno-deprecated -o calc.c calc.y
    ../../tests/calc.at:823: $CC $CFLAGS $CPPFLAGS  $LDFLAGS -o calc calc.c $LIBS
    stderr:
    calc.y: In function 'int yyreport_syntax_error(const yyparse_context_t*)':
    calc.y:157:58: error: conversion to 'int' from 'long unsigned int' may alter its value [-Werror=conversion]
       int n = yysyntax_error_arguments (ctx, arg, sizeof arg / sizeof *arg);
                                                              ^
    cc1plus: all warnings being treated as errors
    stdout:
    ../../tests/calc.at:823: exit code was 1, expected 0

We could use a cast to avoid the warning, but it becomes too
cluttered.  We can also use YYPTRDIFF_T, but that forces the user to
use YYPTRDIFF_T too, although this is an array of tokens, which is
limited by YYNTOKENS, an int.  So let's completely avoid this warning.

* data/skeletons/yacc.c, tests/local.at (yyreport_syntax_error): Avoid
relying on sizeof to compute the array capacity.
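
The idea, as a minimal sketch with hypothetical names (`fill_tokens`,
`count_tokens`, `ARGS_MAX`): declare the capacity once as an int
constant and use it both for the array and for the call, so that no
size_t or long value is ever converted to int.

```c
/* Hypothetical producer: store up to ARGN token numbers in ARG and
   return how many were stored.  */
static int
fill_tokens (int arg[], int argn)
{
  int n = 0;
  for (int tok = 10; tok < 13 && n < argn; ++tok)
    arg[n++] = tok;
  return n;
}

int
count_tokens (void)
{
  /* An enumeration constant has type int: it can size the array and
     be passed as the capacity without sizeof and without a cast.  */
  enum { ARGS_MAX = 2 };
  int arg[ARGS_MAX];
  return fill_tokens (arg, ARGS_MAX);
}
```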
2020-01-17 06:49:59 +01:00