1431 Commits

Author SHA1 Message Date
Akim Demaille
187c2ac344 c++: report the stack at the same places as in C
Let's have C be the reference, and match it elsewhere.  Maybe C is too
verbose and some adjustments are needed, but then that would be done
in another batch of patches.

* data/skeletons/lalr1.cc: Print the stack once we popped after
YYERROR, and before emptying the stack at the end of parsing.
2020-01-15 06:22:42 +01:00
Akim Demaille
f06c0d2c05 c++: display the stack in the same order as in C
Currently the C and C++ parse traces differ in the order in which the
stack is displayed: bottom up in C, top down in C++.  Let's stick to
the C order.

* data/skeletons/stack.hh (stack::iterator, stack::const_iterator)
(begin, end): Be forward, not backward.
2020-01-15 06:22:31 +01:00
Akim Demaille
1c0adb410d yacc.c: comment changes
In particular, import Adrian Vogelsgesang's comments about LAC from
lalr1.cc.

* data/skeletons/yacc.c: here.
2020-01-11 18:01:39 +01:00
Akim Demaille
32b529f038 yacc.c: style: double-quote the argument of b4_percent_define_get
* data/skeletons/yacc.c: Here, for consistency.
2020-01-11 17:35:24 +01:00
Akim Demaille
46cce832fd yacc.c: introduce yysymbol_name
Provide the users with a public API to get the name of the tokens.  A
thin wrapper around yytname.

* data/skeletons/yacc.c (yysymbol_name): New.
Use it.
2020-01-11 16:14:06 +01:00
Akim Demaille
a0675d707f Merge branch 'maint' into HEAD
* maint:
  gnulib: update
  lalr1.cc: avoid static_cast
  glr.c: add missing cast
  regen
  package: bump copyrights to 2020
  gitignore: update
2020-01-11 07:38:39 +01:00
Akim Demaille
3dec8a4caf lalr1.cc: avoid static_cast
Reported by donmac703.
Fixes https://github.com/akimd/bison/issues/20.

* data/skeletons/lalr1.cc: here.
2020-01-10 19:31:00 +01:00
Akim Demaille
2cb52c5a91 glr.c: add missing cast
Reported by psjo.
Fixes https://github.com/akimd/bison/issues/19.

* data/skeletons/glr.c (yyprocessOneStack): Here.
2020-01-10 19:30:54 +01:00
Akim Demaille
c67daa9a97 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-10 19:16:23 +01:00
Akim Demaille
2116af766a yacc.c: simplify use of YYDPRINTF
* data/skeletons/yacc.c (YYDPRINTF): Expand to no-op (instead of
nothing) when disabled.
Simplify callers.
2020-01-09 09:02:55 +01:00
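Note on 2116af766a: a minimal sketch of why a disabled trace macro is easier to use when it expands to a no-op expression rather than to nothing. The names MY_DEBUG, MY_DPRINTF, my_debug_flag and classify are hypothetical stand-ins, not the skeleton's actual text; the point is only that the call site stays a single well-formed statement in both configurations.

```c
#include <stdio.h>

/* Hypothetical switch playing the role of the skeleton's debug setting. */
#ifndef MY_DEBUG
# define MY_DEBUG 0
#endif

#if MY_DEBUG
static int my_debug_flag = 1;   /* plays the role of yydebug */
# define MY_DPRINTF(Args)       \
  do {                          \
    if (my_debug_flag)          \
      fprintf Args;             \
  } while (0)
#else
/* Disabled: still a complete expression, so the call keeps its
   semicolon and behaves like one statement.  */
# define MY_DPRINTF(Args) ((void) 0)
#endif

/* The call can sit directly in an if/else, with no braces or #if
   around it, whether or not tracing is compiled in.  */
static int
classify (int n)
{
  if (n < 0)
    MY_DPRINTF ((stderr, "negative: %d\n", n));
  else
    MY_DPRINTF ((stderr, "non-negative: %d\n", n));
  return n < 0 ? -1 : 1;
}
```

With an empty expansion the same call sites would reduce to bare semicolons, which tends to attract empty-body warnings and pushes callers to add their own guards.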
Akim Demaille
8036635251 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-05 10:26:35 +01:00
Akim Demaille
86a3ec0f8d glr.c: no longer support YYERROR_VERBOSE
* data/skeletons/glr.c: Rather, dispatch directly on parse.error's
value.
2020-01-04 09:14:19 +01:00
Akim Demaille
7122d747cf yacc.c: no longer support YYERROR_VERBOSE
Supporting YYERROR_VERBOSE via cpp is a nuisance: m4 is in charge of
handling alternatives.  When adding more options for %define
parse.error, supporting both CPP and M4 is too complex.  Anyway,
YYERROR_VERBOSE was deprecated long ago.

* data/skeletons/yacc.c: Use m4 only to handle verbose/simple error
messages.
2020-01-04 09:12:43 +01:00
Akim Demaille
69fe4b9eb6 yacc.c: avoid negations
* data/skeletons/yacc.c (yyerrlab): here.
2020-01-03 09:07:40 +01:00
Akim Demaille
385fb345bf glr.c: clarify yyreportSyntaxError
See the previous commit.

* data/skeletons/glr.c (yyreportSyntaxError): First compute the
arguments of the error message, _then_ the error message size.
2019-12-31 12:00:04 +01:00
Akim Demaille
f983d00e77 yacc: restructure and fix yysyntax_error
I would like to offer new ways to build the error message.  As a first
step, let's simplify yysyntax_error whose first loop does two things
at the same time: (i) collect the tokens to be reported in the error
message, and (ii) accumulate their sizes and possibly return
"overflow".  Let's pull (ii) in a second step.

Then test 525 (regression.at:1193: parse.error=verbose overflow)
failed.  This test checks that we correctly report "memory overflow"
when the error message is too large.  However the test is mistaken: it
is triggered in a place where there are five (large) expected tokens,
so anyway we would not display them, so there is no (memory) overflow
here!  Transform this test to (i) check that indeed there is no
overflow, and (ii) create syntax_error3 which does check the intended
behavior, but with four expected tokens.

* data/skeletons/yacc.c (yysyntax_error): First compute the list of
arguments, then compute yysize.
* tests/regression.at (parse.error=verbose overflow): Enhance and fix.
2019-12-31 12:00:04 +01:00
Akim Demaille
b10366f296 glr.cc: avoid compiler warnings
    381. types.at:366: testing glr.cc api.value.type={double} ...
    test.cc:207:57: error: "__clang_major__" is not defined, evaluates to 0 [-Werror=undef]
      207 | #if defined __APPLE__ && YY_CPLUSPLUS < 201103L && 4 <= __clang_major__
          |                                                         ^~~~~~~~~~~~~~~

* data/skeletons/glr.cc: Check __clang_major__ before using it.
2019-12-29 11:13:00 +01:00
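Note on b10366f296: a small sketch of the "check before use" idiom that avoids -Wundef on compilers that do not define __clang_major__. The macro MY_IS_RECENT_CLANG and the function is_recent_clang are hypothetical, and the real condition in glr.cc has more terms (see the excerpt above); only the guard is illustrated here.

```c
/* The preprocessor short-circuits &&, so __clang_major__ is only
   evaluated when it is known to be defined.  */
#if defined __clang_major__ && 4 <= __clang_major__
# define MY_IS_RECENT_CLANG 1   /* hypothetical name */
#else
# define MY_IS_RECENT_CLANG 0
#endif

/* Returns 1 when compiled by Clang 4 or later, 0 otherwise. */
static int
is_recent_clang (void)
{
  return MY_IS_RECENT_CLANG;
}
```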
Paul Eggert
139d065594 warnings: pacify ‘gcc -Wchar-subscripts’ in yacc.c
Problem reported by Andy Fiddaman in:
https://lists.gnu.org/r/bug-bison/2019-12/msg00021.html
* data/skeletons/yacc.c (yy_reduce_print, yy_lac, yysyntax_error)
(yyreturn): If I might be a char, write a[+I] instead of a[I],
so that ‘gcc -Wchar-subscripts’ does not complain.
2019-12-18 13:35:28 -08:00
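Note on 139d065594: a minimal sketch of the a[+I] idiom. The table my_table and the function lookup are hypothetical; the sketch assumes an index whose type is plain char, which is what -Wchar-subscripts complains about.

```c
/* Hypothetical table, for illustration only. */
static const int my_table[] = { 10, 20, 30, 40 };

static int
lookup (char i)
{
  /* my_table[i] would trigger -Wchar-subscripts.  Unary plus applies
     the integer promotions, so the subscript has type int while its
     value is unchanged.  */
  return my_table[+i];
}
```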
Akim Demaille
b3abe014f2 glr.cc: disable warnings from Clang on macOS
    $ cat test.cc
    #include <stddef.h>
    #include <stdint.h>

    ptrdiff_t half_max_capacity = PTRDIFF_MAX;
    $ clang++-mp-9.0 -pedantic -std=c++98 /tmp/test.cc -c
    /tmp/test.cc:4:31: warning: 'long long' is a C++11 extension [-Wc++11-long-long]
    ptrdiff_t half_max_capacity = PTRDIFF_MAX;
                                  ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/stdint.h:149:23:
            note: expanded from macro 'PTRDIFF_MAX'
    #define PTRDIFF_MAX       INT64_MAX
                              ^
    /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/stdint.h:75:26:
            note: expanded from macro 'INT64_MAX'
    #define INT64_MAX        9223372036854775807LL
                             ^
    1 warning generated.

* data/skeletons/glr.cc: here.
2019-12-08 16:34:53 +01:00
Akim Demaille
80f3220fea api.token.raw: fix it in C++
Another breakage revealed by vcsn.

* data/skeletons/c++.m4 (yytranslate_): Do not hard code "yy" and
"parser", both can be changed by the user.
Actually, since we are in the parser itself, there's really no need to
qualify the type.
2019-12-08 16:07:50 +01:00
Akim Demaille
fc2040a750 c++: fix comments for %code blocks
In a project of mine, vcsn, this commit fixes the following comments.

    --- /tmp/parse.hh	2019-12-08 15:51:24.792934703 +0100
    +++ lib/vcsn/rat/parse.hh	2019-12-08 16:00:59.137107503 +0100
    @@ -43,7 +43,7 @@

     #ifndef YY_YY_USERS_AKIM_SRC_LRDE_2_LIB_VCSN_RAT_PARSE_HH_INCLUDED
     # define YY_YY_USERS_AKIM_SRC_LRDE_2_LIB_VCSN_RAT_PARSE_HH_INCLUDED
    -// //                    "%code requires" blocks.
    +// "%code requires" blocks.
     #line 20 "/Users/akim/src/lrde/2/lib/vcsn/rat/parse.yy"

       #include <iostream>
    @@ -1851,7 +1851,7 @@

    -// //                    "%code provides" blocks.
    +// "%code provides" blocks.
     #line 60 "/Users/akim/src/lrde/2/lib/vcsn/rat/parse.yy"

       #define YY_DECL_(Class) \

* data/skeletons/bison.m4 (b4_percent_code_get): Pass an expanded
string to b4_comment.
2019-12-08 16:03:36 +01:00
Akim Demaille
4f961a706d c++: fix spello
* data/skeletons/lalr1.cc: here.
2019-12-08 15:42:41 +01:00
Akim Demaille
046f238826 d: obey parse.error
* data/skeletons/lalr1.d (yysyntax_error): Let the dispatch be
bison-time, not runtime.
2019-12-07 13:23:45 +01:00
Akim Demaille
9bf06f6963 c++: also prefer YY_ASSERT to YYASSERT
Like the other skeletons.

* data/skeletons/variant.hh: here.
2019-12-07 13:23:45 +01:00
Akim Demaille
357336d254 glr.c: obey the parse.assert %define variable
* data/skeletons/glr.c (YYASSERT): Rename as...
(YY_ASSERT): this, for consistency with yacc.c, and also to emphasize
the fact that this is not for the end user (YY_ prefix).
* tests/glr-regression.at: Define parse.assert.
2019-12-07 13:23:45 +01:00
Akim Demaille
d4a6c3c58a c++: beware of short ranges for state numbers
Now that we use small integral types, possibly unsigned (e.g.,
unsigned char), to store state numbers, using -1 to denote an empty
state (i.e., a state that stores no semantic value) is very
dangerous: it will be confused with state 255, which might be
non-empty.

Rather than allocating a larger range of state numbers to keep the
empty-state apart, let's use the number of a state known to store no
value.  The initial state, numbered 0, seems to fit perfectly the job.

Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html

* data/skeletons/lalr1.cc (empty_state): Be 0.
2019-12-07 09:22:55 +01:00
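Note on d4a6c3c58a: a sketch of the hazard described above, assuming a state type that is an unsigned char. The names my_state_t, old_is_empty and new_is_empty are hypothetical and only model the two conventions; they are not the skeleton's code.

```c
#include <limits.h>

typedef unsigned char my_state_t;  /* hypothetical small state type */

/* Old convention: -1 marks "empty".  Converted to unsigned char it
   becomes UCHAR_MAX (255 with 8-bit chars), which may be the number
   of a real, non-empty state.  */
static int
old_is_empty (my_state_t s)
{
  return s == (my_state_t) -1;
}

/* New convention: the initial state, numbered 0, stores no value and
   doubles as the "empty" marker.  */
static int
new_is_empty (my_state_t s)
{
  return s == 0;
}
```

Under the old test the highest representable state number is indistinguishable from "empty"; under the new one it is not, and no extra range is needed.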
Akim Demaille
f8d82ff039 warnings: enable -Wuseless-cast, and eliminate warnings
Prompted by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html

* configure.ac (warn_cxx): Add -Wuseless-cast.
* data/skeletons/c.m4 (b4_attribute_define): Define
YY_IGNORE_USELESS_CAST_BEGIN and YY_IGNORE_USELESS_CAST_END.
* data/skeletons/glr.c (YY_FPRINTF): New, replaces YYFPRINTF, wrapped
with YY_IGNORE_USELESS_CAST_BEGIN and YY_IGNORE_USELESS_CAST_END.
(YY_DPRINTF): Likewise.
* tests/actions.at: Remove useless cast.
* tests/headers.at: Adjust.
2019-12-06 08:27:55 +01:00
Akim Demaille
9e9e49224f diagnostics: style changes
* src/complain.h, src/complain.c: Comment changes.
* src/scan-skel.l: Reduce scopes.
* data/skeletons/bison.m4: Factor diagnostic functions.
2019-12-02 19:35:01 +01:00
Akim Demaille
8b53f4e022 glr.c: style changes
* data/skeletons/glr.c (yysplitStack): Reduce scopes.
* tests/atlocal.in: Formatting changes.
2019-12-02 19:34:48 +01:00
Akim Demaille
8c87a62308 c++: get rid of symbol_type::token ()
It is not used.  And its implementation was wrong when api.token.raw
was defined, as it was still mapping to the external token numbers,
instead of the internal ones.  Besides it was provided only when
api.token.constructor is defined, yet always declared.

* data/skeletons/c++.m4 (by_type::token): Remove, useless.
2019-12-01 10:05:48 +01:00
Akim Demaille
478cb5cf12 c++: remove useless cast about user_token_number_max_
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html

The cast is needed when yytranslate_'s argument type is token_type,
i.e., when api.token.constructor is defined.

    373. types.at:138: testing lalr1.cc api.value.type=variant api.token.constructor ...
    ======== Testing with C++ standard flags: ''
    ../../tests/types.at:138: bison --color=no -fno-caret  -o test.cc test.y
    ../../tests/types.at:138: $CXX $CXXFLAGS $CPPFLAGS  $LDFLAGS -o test test.cc $LIBS
    stderr:
    test.cc:966:16: error: result of comparison of constant 257 with
                    expression of type 'yy::parser::token_type'
                   (aka 'yy::parser::token::yytokentype') is always true
                   [-Werror,-Wtautological-constant-out-of-range-compare]
        else if (t <= user_token_number_max_)
                 ~ ^  ~~~~~~~~~~~~~~~~~~~~~~
    1 error generated.

It is because it is expected that when api.token.constructor is
defined, only symbol constructors will be used, that yytranslate_ then
takes a token_type.  But it is wrong: we still allow literal
characters in this case, as demonstrated by test 373 for instance.

    %define api.value.type variant
    %define api.token.constructor
    %token <std::pair<int, int>> '1' '2';
    [...]
    static yy::parser::symbol_type yylex ()
    {
      static char const input[] = "12";
      int res = input[toknum++];
      typedef yy::parser::symbol_type symbol;
      if (res)
        return symbol (res, std::make_pair (res - '0', res - '0' + 1));
      else
        return symbol (res);
    }

So let yytranslate_ always take an int, which makes the cast truly
useless.

* data/skeletons/c++.m4, data/skeletons/lalr1.cc (yytranslate_): here.
2019-12-01 08:53:58 +01:00
Akim Demaille
94f70bd861 c++: clean a few issues wrt special tokens
The C++ implementation of LAC did not skip the $undefined token,
probably because it was not exposed.  Expose it, and use clearer
names.

* data/skeletons/c++.m4: Don't define undef_token_ in yytranslate_,
but...
* data/skeletons/lalr1.cc (yy_undef_token_): here.
Use a more precise type to define yy_undef_token_ and yy_error_token_.
Unfortunately we move from a compile-time value defined via an enum to
a static const member.  Eventually we should make it constexpr.
Make the LAC implementation more like yacc.c's.
2019-12-01 08:08:19 +01:00
Akim Demaille
9b4f0970fe d, java: improve yytranslate and neighbors
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: Don't expose
yyuser_token_number_max_ and yyundef_token_.  Do as in C++: scope them
into yytranslate_, and only when api.token.raw is not defined.
(yyterror_): Rename as...
(yy_error_token_): this.
* data/skeletons/lalr1.d (token_number_type): New.
Use it.
Can't be done in the Java backend, as Java does not have type aliases.
2019-12-01 07:59:23 +01:00
Akim Demaille
869028a66d d, java: get rid of a useless table
* data/skeletons/lalr1.d, data/skeletons/lalr1.java (yytoken_number_):
Remove, useless.
Was used in ancient C skeletons to support YYPRINT, long obsoleted by
%printer.
2019-12-01 07:38:31 +01:00
Akim Demaille
6f92a7f664 c++, d, java: remove yyerrcode
It is not used at all.  We will remove it also from yacc.c, but
later (see TODO).

* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java (yyerrcode_):
Remove.
2019-11-30 17:30:48 +01:00
Akim Demaille
6a61b6b17e c++: improve typing
* data/skeletons/lalr1.cc (yysyntax_error_): symbol_type::type_get
returns a symbol_number_type (which is indeed an int).
2019-11-30 17:30:48 +01:00
Akim Demaille
a4bf7cdf9e c++: remove useless cast about yyeof_
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html

* data/skeletons/c++.m4 (b4_yytranslate_define): Don't use yyeof_ as
if it had two different types.
It is used once against the input argument, which is the value
returned by yylex, which is an "external token number", typically an
int.  It is also used as output type, an "internal symbol number".
It turns out that in both cases we mean "0", but let's keep yyeof_
only for the case "internal symbol number", i.e., _after_ conversion
by yytranslate.
This frees us from one cast.
2019-11-30 17:30:48 +01:00
Akim Demaille
9471a5ffe9 glr: style change
* data/skeletons/glr.c (YYDPRINTF): Expand into an empty statement,
instead of nothing.
Simplify callers.
2019-11-30 14:41:16 +01:00
Akim Demaille
24c5214ae8 glr: remove useless casts
Reported by GCC's -Wuseless-cast.

* data/skeletons/glr.c: Don't cast to yybool, it's useless.
2019-11-30 14:41:16 +01:00
Akim Demaille
2f7097d1b1 yacc.c, glr.c: fix crash when reporting errors in consistent states
The current code for yysyntax_error for %define parse.error verbose is
fishy (given that YYEMPTY is -2, invalid argument for yytname[]):

    static int
    yysyntax_error ([...])
    {
      YYPTRDIFF_T yysize0 = yytnamerr (YY_NULLPTR, yytname[yytoken]);
    [...]
      if (yytoken != YYEMPTY)

A nearby comment reports

    The only way there can be no lookahead present (in yychar) is if
    this state is a consistent state with a default action.  Thus,
    detecting the absence of a lookahead is sufficient to determine
    that there is no unexpected or expected token to report.  In that
    case, just report a simple "syntax error".

So it _is_ possible to call yysyntax_error with yytoken == YYEMPTY,
albeit quite difficult when meaning to, so virtually impossible by
accident (after all, there was never a bug report about this).

I failed to produce a test case, but Joel E. Denny provided me with
one (added to the test suite below).  The yacc.c skeleton fails on
this, and once fixed dies on a second problem.  The glr.c skeleton was
also dying, but immediately of this second problem.

Indeed we were not allocating space for the error message's final \0.
This was hidden by the fact that we only had error messages with at
least an unexpected token displayed, so with at least one "%s" in the
format string, whose size (2) was included (incorrectly) in the final
size of the message (where the %s have been replaced by the actual
content).

* data/skeletons/glr.c, data/skeletons/yacc.c (yysyntax_error):
Do not invoke yytnamerr on YYEMPTY.
Clarify the computation of the length of the _final_ error message,
with the NUL terminator but without the '%s's.
* tests/conflicts.at (Syntax error in consistent error state):
New, contributed by Joel E. Denny.
2019-11-29 18:21:43 +01:00
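Note on 2f7097d1b1: a sketch of the size computation described above, i.e. the length of the final message with the NUL terminator but without the "%s" directives. The helper message_size is hypothetical and much simpler than yysyntax_error; it assumes the format contains exactly argc "%s" directives and no other '%'.

```c
#include <string.h>

/* Size in bytes, including the final NUL, of the message obtained by
   replacing each "%s" in FORMAT with the corresponding entry of ARGS.  */
static size_t
message_size (const char *format, const char *const *args, int argc)
{
  /* Format length, minus the two characters of each "%s", plus one
     for the terminating NUL.  */
  size_t size = strlen (format) - 2 * (size_t) argc + 1;
  for (int i = 0; i < argc; ++i)
    size += strlen (args[i]);
  return size;
}
```

For a plain "syntax error" with no arguments this gives 13, not 12: with no "%s" in the format there are no spare bytes to hide a missing terminator, which is the situation the commit message describes.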
Akim Demaille
ad32ec64c8 style: pacify syntax-check
* cfg.mk: No need to translate *.md files.
* data/skeletons/glr.c, data/skeletons/yacc.c: Fix space issues.
2019-11-20 07:10:27 +01:00
Akim Demaille
7bdf7246fb c++: expose the type used to store line and column numbers
* data/skeletons/location.cc (position::counter_type)
(location::counter_type): New.
Use them.
* doc/bison.texi (C++ position, C++ location): Adjust.
2019-11-06 18:20:15 +01:00
Akim Demaille
3398b0fa90 c++: fix old cast warnings
We still have a few old C casts in lalr1.cc, let's get rid of them.
Reported by Frank Heckenbach.

Actually, let's monitor all our casts using easy to grep macros.
Let's use these macros to use the C++ standard casts when we are in
C++.

* data/skeletons/c.m4 (b4_cast_define): New.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/stack.hh,
* data/skeletons/yacc.c:
Use it and/or its casts.

* tests/actions.at, tests/cxx-type.at,
* tests/glr-regression.at, tests/headers.at, tests/torture.at,
* tests/types.at:
Use YY_CAST instead of C casts.

* configure.ac (warn_cxx): Add -Wold-style-cast.
* doc/bison.texi: Disable it.
2019-11-02 16:40:50 +01:00
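Note on 3398b0fa90: a sketch of what an easy-to-grep cast macro might look like. MY_CAST and truncate_to_int are hypothetical names modelled on YY_CAST, not the exact text that b4_cast_define emits; the idea shown is only that one spelling selects a C++ cast in C++ and a C cast in C.

```c
/* One spelling for both languages: a standard C++ cast when compiled
   as C++, a plain C cast otherwise.  */
#ifdef __cplusplus
# define MY_CAST(Type, Val) static_cast<Type> (Val)
#else
# define MY_CAST(Type, Val) ((Type) (Val))
#endif

/* Example use: convert a double to int, truncating toward zero. */
static int
truncate_to_int (double d)
{
  return MY_CAST (int, d);
}
```

Because every conversion goes through the macro, a search for its name lists all casts, and -Wold-style-cast stays quiet in C++ builds.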
Akim Demaille
28f1e1546c C++: finish propagating the unsigned->signed conversion in locations
* data/skeletons/location.cc: Remove the u (for unsigned) suffix from
the initial line and column.
* NEWS: AFAICT, only C++ backends have their location types changed.
2019-10-29 09:15:25 +01:00
Akim Demaille
fead28d9e3 style: glr.c: comment changes
* data/skeletons/glr.c: here.
2019-10-29 08:59:18 +01:00
Akim Demaille
0cbefb71e8 lalr1.cc: fix previous commit: printing of state numbers
* data/skeletons/lalr1.cc: Printing a char prints... a char.
Print ints instead.
2019-10-24 23:02:26 +02:00
Akim Demaille
402332c4b6 lalr1.cc: use computed state types
This skeleton uses a single stack of state structures, so it is less
likely to benefit from a stack size reduction than yacc.c (which uses
several stacks: state number, value and location).  But it will reduce
the size of the LAC stack.

This skeleton was already using int for state numbers, so, contrary to
yacc.c, this brings nothing for large automata.

Overall, it is still nicer to make the skeletons alike.

* data/skeletons/lalr1.cc (state_type): Here.
2019-10-24 18:16:01 +02:00
kaneko y
c86b7815fc yacc.c: fix a typo
* data/skeletons/yacc.c (yysetstate): fix comment.
2019-10-22 19:05:02 +02:00
Paul Eggert
54c5d5d1b4 c++: port to Sun C++ 5.12
The documentation for Oracle Solaris Studio 12.3 (Sun C++ 5.12
2011/11/16) says it supports C++03.  This compiler rejects the
location.cc use of std::max for some reason; I don’t know why
since I don’t use C++ as a rule.  The simplest workaround is to
open-code ‘max’.
* data/skeletons/location.cc (add_):
Do max by hand rather than relying on std::max.
Don’t include <algorithm>; no longer needed.
2019-10-17 12:25:05 -07:00
Paul Eggert
83c9051a64 c: port YY_ATTRIBUTE_UNUSED to Sun C 5.12
Sun C 5.12 defines __SUNPRO_C to 0x5120 but diagnoses
‘__attribute__ ((__unused__))’.  Change the ifdefs to use
the same method as Gnulib in this area.
* data/skeletons/c.m4 (YY_ATTRIBUTE): Remove, since
not all attributes were added in the same compiler version.
(YY_ATTRIBUTE_PURE, YY_ATTRIBUTE_UNUSED):
Use specific GCC version for each attribute.
Pay no attention to __SUNPRO_C.
* tests/headers.at (Several parsers): Tighten tests accordingly.
2019-10-17 11:51:20 -07:00