2782 Commits

Author SHA1 Message Date
Akim Demaille
1079595b2a style: reduce length of private constant
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c
(YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as...
(YYARGS_MAX): this.
* src/parse-gram.y (YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as...
(ARGS_MAX): this.
2020-03-23 07:02:34 +01:00
Akim Demaille
466fb66578 regen 2020-03-17 19:21:24 +01:00
Akim Demaille
951da960e6 merge branch 'maint'
* upstream/maint:
  maint: post-release administrivia
  version 3.5.3
  news: update for 3.5.3
  yacc.c: make sure we properly propagated the user's number for error
  diagnostics: don't crash because of repeated definitions of error
  style: initialize some struct members
  diagnostics: beware of zero-width characters
  diagnostics: be sure to close the styling when lines are too short
  muscles: fix incorrect decoding of $
  code: be robust to reference with invalid tags
  build: fix typo
  doc: update recommandation for libtextstyle
  style: comment changes
  examples: use consistently the GFDL header for readmes
  style: remove useless declarations
  typo: succesful -> successful
  README: point to tests/bison, and document --trace
  gnulib: update
  maint: post-release administrivia
2020-03-08 10:13:16 +01:00
Akim Demaille
cfcd823e16 diagnostics: don't crash because of repeated definitions of error
According to https://www.unix.com/man-page/POSIX/1posix/yacc/, the
user is allowed to specify her user number for the error token:

    The token error shall be reserved for error handling. The name
    error can be used in grammar rules. It indicates places where the
    parser can recover from a syntax error. The default value of error
    shall be 256. Its value can be changed using a %token
    declaration. The lexical analyzer should not return the value of
    error.

I think this feature is useless, the user should not have to deal with
that.  The intend is probably to give the user a means to use 256 if
she wants to, but provided "error" cleared the path first by being
assigned another number.  In the case of Bison, 256 is assigned to
"error" at the end if the user did not use it for a token of hers.  So
this feature is useless.

Yet it is valid, and if the user assigns twice a token number to
"error", then the second time we want to complain about it and want to
show the original definition.  At this point, we try to display the
built-in definition of "error", whose location is NULL, and we crash.

Rather, the location of the first user definition of "error" should
become its defining location.

Reported byg Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00007.html

* src/symtab.c (symbol_class_set): If this is a declaration and the
symbol was not declared yet, keep this as defining location.
* tests/input.at (Redefining the error token): New.
2020-03-08 08:10:11 +01:00
Akim Demaille
2f02d9beae style: initialize some struct members
* src/symtab.c (sym_content_new): Initialize all the location members.
Not needed by the code, but disturbing values when using a debugger.
2020-03-08 08:10:11 +01:00
Akim Demaille
b638603477 diagnostics: beware of zero-width characters
Currenly we rely on (visual) width of the characters to decide where
to open and close the styling of the quoted lines.  This breaks when
we deal with zero-width characters: we cannot just rely on (visual)
columns, we need to know whether we are before, inside, or after the
highlighted portion.

* src/location.c (location_caret): col_end: no longer add 1, "regular"
characters have a width of 1, only 0-width characters have 0-width.
opened: replace with 'state', a three-valued enum.
Don't reopen the style if we already did.
* tests/diagnostics.at (Zero-width characters): New.
2020-03-08 08:10:11 +01:00
Akim Demaille
e21ff47f5d diagnostics: be sure to close the styling when lines are too short
bar.y:4.12-17: <error>error:</error> redefining user token number of foo
    -    4 | %token foo <error>123
    +    4 | %token foo <error>123</error>
           |            <error>^~~~~~</error>

* src/location.c (location_caret): Be sure to close.
* tests/diagnostics.at (Line is too short, and then you die): New.
2020-03-07 10:01:52 +01:00
Akim Demaille
b82b387da9 muscles: fix incorrect decoding of $
Bug introduced in 458171e6df.
https://lists.gnu.org/archive/html/bison-patches/2013-11/msg00009.html

Reported by Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00010.html

* src/muscle-tab.c (COMMON_DECODE): "$" is coded as "$][", not "$[][".
* tests/input.at ("%define" enum variables): Check that case.
2020-03-07 07:45:10 +01:00
Akim Demaille
641e326303 code: be robust to reference with invalid tags
Because we want to support $<a->b>$, we must accept -> in type tags,
and reject $<->$, as it is unfinished.
Reported by Ahcheong Lee.

* src/scan-code.l (yylex): Make sure "tag" does not end with -, since
-> does not close the tag.
* tests/input.at (Stray $ or @): Check this.
2020-03-06 17:29:26 +01:00
Akim Demaille
666df338a7 style: comment changes
* src/symtab.h, src/lr0.c: here.
2020-03-06 08:32:03 +01:00
Akim Demaille
b493c173c9 style: remove useless declarations
* src/reader.h: Don't duplicate what parse-gram.h already exposes.
* src/lr0.h: Remove useless include.
2020-03-06 08:30:21 +01:00
Adrian Vogelsgesang
aab3feb5a1 typo: succesful -> successful
* data/skeletons/lalr1.cc: here
* etc/bench.pl.in: here
* src/location.c: and here.
2020-03-06 08:29:58 +01:00
Akim Demaille
30d01b21e7 c++: minor fixes
Address compiler warnings such as

    warning: declaration of 'yyla' shadows a member of 'yy::parser::context' [-Wshadow]

* data/skeletons/lalr1.cc (context): Don't use the same names for
variables and members.
Use foo_ for private members, as in parser.
Also, use the + trick in array accesses to please ICC and provide it
with an int.
2020-02-27 21:31:09 +01:00
Adrian Vogelsgesang
368fcf0af5 typo: succesful -> successful
* data/skeletons/lalr1.cc: here
* etc/bench.pl.in: here
* src/location.c: here
* tests/calc.at: and here
2020-02-27 18:10:39 +01:00
Akim Demaille
296660304c style: comment changes
* src/symtab.h, src/lr0.c: here.
2020-02-23 08:25:53 +01:00
Akim Demaille
323731ff74 style: avoid using 'this' as an identifier
LLDB insists on parsing 'this' as a C++ keyword, even when debugging a
C program.

* src/symtab.c: Please the dictator.
2020-02-23 08:25:53 +01:00
Akim Demaille
6f7f949708 style: remove useless declarations
* src/reader.h: Don't duplicate what parse-gram.h already exposes.
* src/lr0.h: Remove useless include.
2020-02-23 08:25:53 +01:00
Akim Demaille
7a28659495 regen 2020-02-15 08:28:57 +01:00
Victor Morales Cayuela
e09a72eeb0 diagnostics: modernize the display of submessages
Since Bison 2.7, output was indented four spaces for explanatory
statements.  For example:

    input.y:2.7-13: error: %type redeclaration for exp
    input.y:1.7-11:     previous declaration

Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:

    input.y:2.7-13: error: %type redeclaration for exp
        2 | %type <float> exp
          |       ^~~~~~~
    input.y:1.7-11: note: previous declaration
        1 | %type <int> exp
          |       ^~~~~

* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
2020-02-15 08:28:40 +01:00
Akim Demaille
cc3760ef51 news: 3.5.2
* NEWS: Update.
2020-02-13 18:25:11 +01:00
Akim Demaille
8637f2c7d6 build: pacify syntax-check
* src/complain.c: Fix indentation.
* cfg.mk: Using strcmp is ok in the tests.
Test cases and examples don't need Bison's PO support.
2020-02-10 20:46:56 +01:00
Akim Demaille
6946149701 regen 2020-02-10 20:42:23 +01:00
Akim Demaille
18a7cfc7cf java: make the syntax error format string translatable
The error format should be translated, but contrary to the case of
C/C++, we cannot just depend on macros to adapt on the
presence/absence of '_'.  Let's consider that the message format is to
be translated iff there are some internationalized tokens.

* src/output.c (prepare_symbol_names): Define b4_has_translations.
* data/skeletons/java.m4 (b4_trans): New.
* data/skeletons/lalr1.java: Use it to emit translatable or not the
format string.
2020-02-08 11:24:53 +01:00
Akim Demaille
52db24b2bc java: add support for parse.error=detailed
In Java there is no need for N_ and yytranslate_.  So instead of
hard-coding the use of N_ in the table of the symbol names, rely on
b4_symbol_translate.

* src/output.c (prepare_symbol_names): Use b4_symbol_translate instead
of N_.
* data/skeletons/c.m4 (b4_symbol_translate): New.
* data/skeletons/lalr1.java (yysymbolName): New.
Use it.
* examples/java/calc/Calc.y: Use parse.error=detailed.
* tests/calc.at: Check parse.error=detailed.
2020-02-08 11:24:53 +01:00
Akim Demaille
e6b0612f91 bison: pretend to 3.6 already
* src/parse-gram.y: here.
2020-01-26 13:29:18 +01:00
Akim Demaille
81849520cd regen 2020-01-23 08:30:51 +01:00
Akim Demaille
fc2191f137 diagnostics: modernize bison's syntax errors
We used to display the unexpected token first:

    $ bison foo.y
    foo.y:1.8-13: error: syntax error, unexpected %token, expecting character literal or identifier or <tag>
        1 | %token %token
          |        ^~~~~~

GCC uses a different format:

    $ gcc-mp-9 foo.c
    foo.c:1:5: error: expected identifier or '(' before ')' token
        1 | int()()()
          |     ^

and so does Clang:

    $ clang-mp-9.0 foo.c
    foo.c:1:5: error: expected identifier or '('
    int()()()
        ^
    1 error generated.

They display the unexpected token last (or not at all).  Also, they
don't waste width with "syntax error".  Let's try that.  It gives, for
the same example as above:

    $ bison foo.y
    foo.y:1.8-13: error: expected character literal or identifier or <tag> before %token
        1 | %token %token
          |        ^~~~~~

* src/complain.h, src/complain.c (syntax_error): New.
* src/parse-gram.y (yyreport_syntax_error): Use it.
2020-01-23 08:30:28 +01:00
Akim Demaille
6fb362c87a regen 2020-01-23 08:26:33 +01:00
Akim Demaille
46ab1d0cbe diagnostics: report syntax errors in color
* src/parse-gram.y (parse.error): Set to 'custom'.
(yyreport_syntax_error): New.
* data/bison-default.css (.expected, .unexpected): New.
* tests/diagnostics.at: Adjust.
2020-01-23 08:26:33 +01:00
Akim Demaille
f54a5b303b regen 2020-01-23 08:26:33 +01:00
Akim Demaille
2cc361387c diagnostics: translate bison's own tokens
As a test case, support translations in Bison itself.

* src/parse-gram.y: Mark the translatable tokens.
While at it, use clearer names.
* tests/input.at: Adjust expectations.
2020-01-23 08:26:28 +01:00
Akim Demaille
e6d1289f4a diagnostics: handle -fno-caret in the called functions
Don't force callers of location_caret to have to deal with flags that
disable it.

* src/location.h, src/location.c (location_caret)
(location_caret_suggestion): Early return if disabled.
* src/complain.c: Simplify.
2020-01-22 22:31:41 +01:00
Akim Demaille
6ada985ff3 parsers: issue tname with i18n markup
Some users would like to avoid having to "parse" the *.y file to find
the strings to translate.  Let's issue the translatable tokens with N_
to allow "parsing" the generated parsers instead.

See
https://lists.gnu.org/archive/html/bison-patches/2019-01/msg00015.html

* src/output.c (prepare_symbol_names): Issue symbol_names with N_()
markup.
2020-01-19 21:23:11 +01:00
Akim Demaille
1db962716a regen 2020-01-19 21:23:11 +01:00
Akim Demaille
9096955fba parsers: support translatable token aliases
In addition to

    %token NUM "number"

accept

    %token NUM _("number")

in which case the token will be translated in error messages.
Do not use _() in the output if there are no translatable tokens.

* src/symtab.h, src/symtab.c (symbol): Add a 'translatable' member.
* src/parse-gram.y (TSTRING): New token.
(string_as_id.opt): Replace with...
(alias): this.
Use it.
* src/scan-gram.l (SC_ESCAPED_TSTRING): New start conditions, to match
TSTRINGs.
* src/output.c (prepare_symbols): Define b4_translatable if there are
translatable strings.

* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c (yytnamerr): Receive b4_translatable, and use it.
2020-01-19 21:23:11 +01:00
Akim Demaille
d9df62bfcd yacc.c: escape trigraphs in detailed parse.error
* src/output.c (escape_trigraphs, xescape_trigraphs): New.
(prepare_symbol_names): Use it.
* tests/regression.at: Check the handling of trigraphs with
parse.error = detailed.
2020-01-19 21:22:41 +01:00
Akim Demaille
adac9a17f0 regen 2020-01-19 14:51:14 +01:00
Akim Demaille
3b4b157369 bison: use detailed error messages
* #: .
2020-01-19 14:51:14 +01:00
Akim Demaille
5c332d70d7 regen 2020-01-19 14:51:14 +01:00
Akim Demaille
f443673450 yacc.c: add support for parse.error detailed
"detailed" error messages are almost like "verbose", except that we
don't double escape them, they don't get inner quotes, we don't use
yytnamerr, and we hide the table.

"custom" is exposed with the "detailed" tokens, not the "verbose"
ones: they are not double-quoted.

Because there's a risk that some people use yytname even without
"verbose", let's keep yytname (instead of yys_name) in "simple"
parse.error.

* src/output.c (prepare_symbol_names): Be ready to output symbol names
unquoted.
(prepare_symbol_names): Output both the old tname table, and the new
symbol_names one.
* data/skeletons/bison.m4: Accept 'detailed'.
* data/skeletons/yacc.c: When parse.error is 'detailed', don't emit
yytname and yytnamerr, just yysymbol_name with the table inside.
* tests/calc.at: Adjust.
2020-01-19 14:51:14 +01:00
Akim Demaille
8e6233353f c: use yysymbol_name in traces
Only parse.error verbose and simple will get the original yytname: the
other options will rely on a different table.  So let's move on top of
the yysymbol_name function.

* data/skeletons/c.m4 (yy_symbol_print): Use yysymbol_name.
* data/skeletons/glr.c (yytokenName): Rename as...
(yysymbol_name): this.
The change of naming scheme is unfortunate, but it's definitely glr.c
which is "wrong".
2020-01-19 14:51:14 +01:00
Akim Demaille
c6e2dcabae regen 2020-01-17 06:51:24 +01:00
Akim Demaille
a1aef9825a regen 2020-01-17 06:49:45 +01:00
Akim Demaille
a0675d707f Merge branch 'maint' into HEAD
* maint:
  gnulib: update
  lalr1.cc: avoid static_cast
  glr.c: add missing cast
  regen
  package: bump copyrights to 2020
  gitignore: update
2020-01-11 07:38:39 +01:00
Akim Demaille
04f3bfc596 regen 2020-01-10 19:21:35 +01:00
Akim Demaille
c67daa9a97 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-10 19:16:23 +01:00
Akim Demaille
e419eda5e2 regen 2020-01-10 18:38:02 +01:00
Akim Demaille
8036635251 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-05 10:26:35 +01:00
Akim Demaille
d55f240991 parser: pretend we are Bison 3.5
* src/parse-gram.y: Accept we're Bison 3.5.
2019-12-08 16:03:36 +01:00
Akim Demaille
6dca1eb950 regen 2019-12-06 08:27:55 +01:00