Commit Graph

131 Commits

Author SHA1 Message Date
Akim Demaille
75c3746ce2 options: rename --defines as --header
The name "defines" is incorrect, the generated file contains far more
than just #defines.

* src/getargs.h, src/getargs.c (-H, --header): New option.
With optional argument, just like --defines, --xml, etc.
(defines_flag): Rename as...
(header_flag): this.
Adjust dependencies.
* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/glr2.cc, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c:
Adjust.
* examples, doc/bison.texi: Adjust.
* tests/headers.at, tests/local.at, tests/output.at: Convert most
tests from using --defines to using --header.
2020-09-19 08:31:49 +02:00
Akim Demaille
a1b7fef045 c: always use YYMALLOC/YYFREE
Reported by Kovalex <kovalex.pro@gmail.com>.
https://lists.gnu.org/r/bug-bison/2020-08/msg00015.html

* data/skeletons/yacc.c: Don't make direct calls to malloc/free.
* tests/calc.at: Check it.
2020-08-30 10:05:18 +02:00
Akim Demaille
0820f16ca8 style: update comments
* src/reader.c: action_obstack was removed in 2002...
* src/parse-gram.y: Better names.
* src/scan-code.h: More comments.
2020-07-05 09:59:45 +02:00
Akim Demaille
49f1e5f428 style: update comments in the skeletons
* data/skeletons/c++.m4, data/skeletons/glr.c, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c:
Be more accurate about yychar and yytoken.
Don't name local variables as if they were members.
2020-07-05 09:59:25 +02:00
Akim Demaille
330552ea49 yacc.c: push: don't clear the parser state when accepting/rejecting
Currently when a push parser finishes its parsing (i.e., it did not
return YYPUSH_MORE), it also clears its state.  It is therefore
impossible to see if it had parse errors.

In the context of autocompletion, because error recovery might have
fired, the parser is actually already in a different state.  For
instance on `(1 + + <TAB>` in the bistromathic, because there's a
`exp: "(" error ")"` recovery rule, `1 + +` tokens have already been
popped, replaced by `error`, and autocompletions think we are ready
for the closing ")".  So here, we would like to see if there was a
syntax error, yet `yynerrs` was cleared.

In the case of a successful parse, we still have a problem: if error
recovery succeeded, we won't know it, since, again, `yynerrs` is
clearer.

It seems much more natural to leave the parser state available for
analysis when there is a failure.

To reuse the parser, we should either:

1. provide an explicit means to reinitialize a parser state for future
   parses.

2. automatically reset the parser state when it is used in a new
   parse.

Option 2 requires to check whether we need to reinitialize the parser
each time we call `yypush_parse`, i.e., each time we give a new token.
This seems expensive compared to Option 1, but benchmarks revealed no
difference.  Option 1 is incompatible with the documentation
("After `yypush_parse` returns a status other than `YYPUSH_MORE`, the
parser instance `yyps` may be reused for a new parse.").

So Option 2 wins, reusing the private `yynew` member to record that a
parse was finished, and therefore that the state must reset in the
next call to `yypull_parse`.

While at it, this implementation now reuses the previously enlarged
stacks from one parse to another.

* data/skeletons/yacc.c (yypstate_new): Set up the stacks in their
initial configurations (setting their bottom to the stack array), and
use yypstate_clear to reset them (moving their top to their bottom).
(yypstate_delete): Adjust.
(yypush_parse): At the beginning, clear yypstate if needed, and at the
end, record when yypstate needs to be clearer.

* examples/c/bistromathic/parse.y (expected_tokens): Do not propose
autocompletion when there are parse errors.
* examples/c/bistromathic/bistromathic.test: Check that case.
2020-06-29 19:36:41 +02:00
Akim Demaille
ed10c308fa yacc.c: simplify initialization of push parsers
The previous commit ("yacc.c: declare and initialize and the same
time") made b4_initialize_parser_state_variables useless.

* data/skeletons/yacc.c (b4_initialize_parser_state_variables): Inline
into...
(yypstate_clear): here.
2020-06-29 19:10:05 +02:00
Akim Demaille
29520abb3b yacc.c: declare and initialize and the same time
In order to factor the code of push and pull parsers, the declaration
of the parser's state variable was common (being local variable in
pull parsers, and struct members in push parsers).  This result in
rather poor style in pull parser, with first variable declarations,
and then their initializations.

The initialization is about to differ between push and pull parsers,
so it is no longer worth keeping both cases together.

* data/skeletons/yacc.c (b4_declare_parser_state_variables): Accept an
argument, and when it is set, initialize the variables.
Adjust dependencies.
2020-06-29 19:10:05 +02:00
Akim Demaille
2491de1eef yacc.c: style changes in push mode
* data/skeletons/yacc.c: here.
2020-06-29 19:10:05 +02:00
Akim Demaille
ec207d1bb2 yacc.c: simplify yypull_parse
Currently yypull_parse takes a yypstate* as argument, and accepts it
to be NULL.  This does not seem to make a lot of sense: rather it is
its callers that should do that.

I believe this is historical: yypull_parse was introduced
first (c3d503425f), with yyparse being a
macro.  So yyparse could hardly deal with memory allocation properly.
In 7172e23e8f that yyparse was turned
into a genuine function.  At that point, it should have allocated its
own yypstate*, which would have left yypull_parse deal with only one
single non-null ypstate* argument.

Fortunately, it is nowhere documented that it is valid to pass NULL to
yypull_parse.  It is now forbidden.

* data/skeletons/yacc.c (yypull_parse): Don't allocate a yypstate.
Needs a location to issue the error message.
(yyparse): Allocate the yypstate.
2020-06-29 19:10:05 +02:00
Akim Demaille
a53c6026cd api.header.include: document it, and fix its default value
While defining api.header.include worked as expected, its default
value was incorrectly defined.  As a result, by default, the generated
parsers still duplicated the content of the generated header instead
of including it.

* data/skeletons/yacc.c (api.header.include): Fix its default value.
* tests/output.at: Check it.
* doc/bison.texi (%define Summary): Document api.header.include.
While at it, move the definition of api.namespace at the proper
place.
2020-06-09 08:09:26 +02:00
Akim Demaille
7e16bd2cae Merge maint into HEAD
* upstream/maint:
  maint: post-release administrivia
  version 3.6.3
  build: check -Wmissing-prototypes
  tests: show logs
  c++: fix printing of state number on streams
2020-06-03 08:12:10 +02:00
Akim Demaille
52ce2a008b build: check -Wmissing-prototypes
pstate_clear is lacking a prototype.
Reported by Ryan
https://lists.gnu.org/r/bug-bison/2020-05/msg00101.html

Besides, none of the C examples were compiled with the warning flags.

* configure.ac (warn_c): Add -Wmissing-prototypes.
* data/skeletons/yacc.c (pstate_clear): Make it static.
* examples/local.mk (TEST_CFLAGS): New.
* examples/c/bistromathic/local.mk, examples/c/calc/local.mk,
* examples/c/lexcalc/local.mk, examples/c/mfcalc/local.mk,
* examples/c/pushcalc/local.mk, examples/c/reccalc/local.mk,
* examples/c/rpcalc/local.mk:
Use it.

GCC's warn_unused_result is not silenced by a cast to void, so we have
to "use" scanf's result.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25509
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

Flex generated code produces too many warnings, including things such
as, with ICC:

    examples/c/lexcalc/scan.c(1088): error #1682: implicit conversion
              of a 64-bit integral type to a smaller integral type (potential portability problem)
    2259                YY_INPUT( (&YY_CURRENT_BUFFER_LVALUE->yy_ch_buf[number_to_move]),
    2260                ^
    2261
    2262

I am tired of trying to fix Flex's output.  The project does not seem
maintained.  We ought to avoid it.  So, for the time being, don't try
to enable warnings with Flex.

* examples/c/bistromathic/parse.y, examples/c/reccalc/scan.l: Fix
warnings.
* doc/bison.texi: Discard scanf's return value to defeat
-Werror=unused-result.
2020-06-01 08:29:53 +02:00
Joshua Watt
dd878d1851 bison: add command line option to map file prefixes
Teaches bison about a new command line option, --file-prefix-map OLD=NEW
(based on the -ffile-prefix-map option from GCC) which causes it to
replace and file path of OLD in the text of the output file with NEW,
mainly for header guards and comments. The primary use of this is to
make builds reproducible with different input paths, and in particular
the debugging information produced when the source code is compiled. For
example, a distro may know that the bison source code will be located at
"/usr/src/bison" and thus can generate bison files that are reproducible
with the following command:

    bison --output=/build/bison/parse.c -d --file-prefix-map=/build/bison/=/usr/src/bison/ parse.y

Importantly, this will change the header guards and #line directives
from:

    #ifndef YY_BUILD_BISON_PARSE_H
    #line 100 "/build/bison/parse.h"

to

    #ifndef YY_USR_SRC_BISON_PARSE_H
    #line 100 "/usr/src/bison/parse.h"

which is reproducible.

See https://lists.gnu.org/r/bison-patches/2020-05/msg00016.html
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>

* src/files.h, src/files.c (spec_mapped_header_file)
(mapped_dir_prefix, map_file_name, add_prefix_map): New.
* src/getargs.c (-M, --file-prefix-map): New option.
* src/output.c (prepare): Define b4_mapped_dir_prefix and
b4_spec_header_file.
* src/scan-skel.l (@ofile@): Output the mapped file name.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/location.cc,
* data/skeletons/yacc.c:
Adjust.
* doc/bison.texi: Document.
* tests/input.at, tests/output.at: Check.
2020-05-24 15:17:15 +02:00
Akim Demaille
e7aff57122 style: rename user_token_number as code
This should have been done in 3.6, but I wanted to avoid introducing
conflicts into Vincent's work on counterexamples.  It turns out it's
completely orthogonal.

* data/README.md, data/skeletons/bison.m4, data/skeletons/c++.m4,
* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/java.m4,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/variant.hh, data/skeletons/yacc.c, src/conflicts.c,
* src/derives.c, src/gram.c, src/gram.h, src/output.c,
* src/parse-gram.c, src/parse-gram.y, src/print-xml.c, src/print.c,
* src/reader.c, src/symtab.c, src/symtab.h, tests/input.at,
* tests/types.at:
s/user_token_number/code/g.
Plus minor changes.
2020-05-23 08:43:58 +02:00
Akim Demaille
2da399d15f yacc.c: restore ansi-c compatibility
Reported by neok-m4700.
https://github.com/akimd/bison/issues/37

* data/skeletons/yacc.c: Don't use // comments.
2020-05-09 17:05:03 +02:00
Akim Demaille
d9a9b054ae all: fix the interface of yyexpected_tokens
The user gives yyexpected_tokens a limit: the max number of tokens she
wants to hear about.  That's because an error message that reports a
bazillion of possible tokens is useless.

In that case yyexpected_tokens returned 0, so the user would not know
if there are too many expected tokens or none (yes, that's possible).

There are several ways to tell the user in which situation she's in:

- return some E2MANY, a negative value.  Then it makes the pattern

    int argsize = yypcontext_expected_tokens (ctx, arg, ARGS_MAX);
    if (argsize < 0)
      return argsize;

  no longer valid, as for E2MANY (i) the user must generate the error
  message anyway, and (ii) she should not return E2MANY

- return ARGS_MAX + 1.  Then it makes it dangerous for the user, as
  she has to iterate update `min (ARGS_MAX, argsize)`.

Returning 0 is definitely simpler and safer for the user, as it tells
her "this is not an error, just generate your message without a list
of expecting tokens".  So let's still return 0, but set arg[0] to the
empty token when the list is really empty.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.java
* data/skeletons/yacc.c (yyexpected_tokens): Put the empty symbol
first if there are no possible tokens at all.
* examples/c/bistromathic/parse.y: Demonstrate how to use that.
2020-05-06 08:11:52 +02:00
Akim Demaille
4b85b969d0 glr.c: beware of portability issues with PTRDIFF_MAX
For instance test 386, "glr.cc api.value.type={double}":

    types.at:366: $CXX $CXXFLAGS $CPPFLAGS  $LDFLAGS -o test test.cc $LIBS
    stderr:
    test.cc: In function 'ptrdiff_t yysplitStack(yyGLRStack*, ptrdiff_t)':
    test.cc:490:4: error: 'PTRDIFF_MAX' was not declared in this scope
       (PTRDIFF_MAX < SIZE_MAX ? PTRDIFF_MAX : YY_CAST (ptrdiff_t, SIZE_MAX))
        ^
    test.cc:1805:37: note: in expansion of macro 'YYSIZEMAX'
           ptrdiff_t half_max_capacity = YYSIZEMAX / 2 / state_size;
                                         ^~~~~~~~~
    test.cc:490:4: note: suggested alternative: '__PTRDIFF_MAX__'
       (PTRDIFF_MAX < SIZE_MAX ? PTRDIFF_MAX : YY_CAST (ptrdiff_t, SIZE_MAX))
        ^
    test.cc:1805:37: note: in expansion of macro 'YYSIZEMAX'
           ptrdiff_t half_max_capacity = YYSIZEMAX / 2 / state_size;
                                         ^~~~~~~~~

The failing tests are using glr.cc only, which I don't understand, the
problem is rather in glr.c, so I would expect glr.c tests to also fail.

Reported by Bruno Haible.
https://lists.gnu.org/archive/html/bug-bison/2020-05/msg00053.html

* data/skeletons/yacc.c: Move the block that defines
YYPTRDIFF_T/YYPTRDIFF_MAXIMUM, YYSIZE_T/YYSIZE_MAXIMUM, and
YYSIZEOF to...
* data/skeletons/c.m4 (b4_sizes_types_define): Here.
(b4_c99_int_type): Also take care of the #undefinition of short.
* data/skeletons/yacc.c, data/skeletons/glr.c: Use
b4_sizes_types_define.
* data/skeletons/glr.c: Adjust to use YYPTRDIFF_T/YYPTRDIFF_MAXIMUM,
YYSIZE_T/YYSIZE_MAXIMUM.
2020-05-04 07:44:42 +02:00
Akim Demaille
76c3bccf40 yacc.c: improve formatting of the generated code
* data/skeletons/yacc.c (yy_reduce_print): here.
2020-05-02 10:17:01 +02:00
Akim Demaille
fb1d76d9a9 yacc.c: avoid the use of a temporary
* data/skeletons/yacc.c: Use YYLLOC_DEFAULT directly with the final
destination.
2020-04-30 08:07:55 +02:00
Akim Demaille
3b05de2d05 yacc.c: install backward compatibility for YYERRCODE
Some people have been using that symbol.  Some even have #defined it
themselves.
https://lists.gnu.org/r/bison-patches/2020-04/msg00138.html

Let's provide backward compatibility, having it point to YYUNDEF, so
that an error message is generated.

* data/skeletons/yacc.c (YYERRCODE): New, at the exact same location
it was defined before.
2020-04-28 08:26:49 +02:00
Akim Demaille
b254b36db8 all: don't emit an error message when the scanner returns YYERRCODE
I'm quite pleased to see that the tricky case of glr.c was already
prepared by the changes to support syntax_error exceptions.  Better
yet, it is actually syntax_error that becomes a special case of the
general pattern: make yytoken be YYERRCODE.

* data/skeletons/glr.c (YYFAULTYTOK): Remove the now useless (Basil)
Faulty token.
Instead, use the error token.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: When computing
the action, first check the case of the error token.

* tests/calc.at: Check cases for the error token symbols before and
after it.
2020-04-26 19:55:52 +02:00
Akim Demaille
58e79539fc c: don't emit an error message when the scanner returns YYERRCODE
* data/skeletons/yacc.c (yyparse): When the scanner returns YYERRCODE,
go directly to error recovery (yyerrlab1).
However, don't keep the error token as lookahead, that token is too
special.
* data/skeletons/lalr1.cc: Likewise.

* examples/c/bistromathic/parse.y (yylex): Use that feature to report
nicely invalid characters.
* examples/c/bistromathic/bistromathic.test: Check that.
* examples/test: Neutralize gratuitous differences such as rule
position.

* tests/calc.at: Check that case in C only.
The other case seem to be working, but that's an illusion that the
next commit will address (in fact, they can enter endless loops, and
report the error several times anyway).
2020-04-26 18:05:30 +02:00
Akim Demaille
286d0755f8 all: prefer YYERRCODE to YYERROR
We will not keep YYERRCODE anyway, it causes backward compatibility
issues.  So as a first step, let all the skeletons use that name,
until we have a better one.

* data/skeletons/bison.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c, doc/bison.texi, tests/headers.at,
* tests/input.at:
here.
2020-04-26 15:09:51 +02:00
Akim Demaille
9c72d3c5a8 style: prefer b4_has_translations_if
* data/skeletons/glr.c, data/skeletons/yacc.c: here.
2020-04-26 13:34:30 +02:00
Akim Demaille
22eeb1ab8a style: fix a few remaining 'type' instead of 'kind'
* data/skeletons/glr.c, data/skeletons/yacc.c (YY_SYMBOL_PRINT):
Here.
2020-04-26 10:57:22 +02:00
Akim Demaille
c4dbc1776c skeletons: make the warning about implementation details clearer
* data/skeletons/bison.m4 (b4_disclaimer): Here.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: Use it.
2020-04-26 10:57:02 +02:00
Akim Demaille
b74fc07d21 style: c: fix a few minor issues about indentation of cpp directives
* README-hacking.md: More about cpp.
* data/skeletons/c.m4, data/skeletons/yacc.c: Style changes.
2020-04-25 12:16:57 +02:00
Akim Demaille
150dc95395 style: clarify #endif
We could try to avoid the weird "#if 1", but then the indentation of
the inner #if would be wrong.  Let' keep it this way.

* data/skeletons/yacc.c: here.
Also, avoid sticking the comment to the directive.
2020-04-25 11:06:16 +02:00
Akim Demaille
81334eb5a0 c, c++: provide a default definition for N_
In C/C++, N_ is a no-op.  Define it if the user didn't.
Suggested by Frank Heckenbach.
https://lists.gnu.org/r/bug-bison/2020-04/msg00010.html

* src/output.c (prepare_symbol_names): Rename has_translations as
has_translations_flag.
* data/skeletons/bison.m4 (b4_has_translations_if): New.
* data/skeletons/java.m4 (b4_trans): Use it.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c
(N_): Provide a default definition.
2020-04-20 07:37:45 +02:00
Akim Demaille
9b7e7077dd style: fix comments
* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c: here.
2020-04-19 15:40:12 +02:00
Akim Demaille
caadfc552b skeletons: use symbol(-2, kind)
Not all the symbols have a fixed symbol code.  UNDEF's one is fixed:
-2.

* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c: here.
2020-04-16 07:35:06 +02:00
Akim Demaille
c4c25e091c style: comments changes about error handling
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: here.
* data/skeletons/lalr1.cc: Reduce scope.
2020-04-16 07:34:37 +02:00
Akim Demaille
64aec0a8d8 c, c++: also define YYEMPTY in yytoken_kind_t
I have been hesitating a lot before doing it ---after all the user
must not use this kind, so what's the point of showing it in
yytoken_kind_t.  And eventually I chose to play it safe with the
typing system and make it possible to use yytoken_kind_t for all the
tokens, even the "empty token".

* data/skeletons/c.m4: Give an id and a tag to YYEMPTY.
(b4_token_enums): Define YYEMPTY.
* data/skeletons/c++.m4 (b4_token_enums): Define YYEMPTY.
* data/skeletons/glr.c, data/skeletons/glr.cc, data/skeletons/yacc.c:
(YYEMPTY): Remove.
Use b4_symbol(-2, id) instead.
2020-04-13 16:49:48 +02:00
Akim Demaille
5839f4d289 c: rename yyexpected_tokens as yypcontext_expected_tokens
The user should think of yypcontext fields as accessible only via
yypcontext_* functions.  So let's rename yyexpected_tokens to reflect
that.

Let's _not_ rename yyreport_syntax_error, as the user may define this
function, and is not allowed to access directly the fields of
yypcontext_t: she *must* use the "accessors".  This is comparable to
the case of C++/Java where the user defines
parser::report_syntax_error, not parser::context::report_syntax_error.

* data/skeletons/glr.c, data/skeletons/yacc.c (yyexpected_tokens):
Rename as...
(yypcontext_expected_tokens): this.
Adjust dependencies.
2020-04-12 19:23:40 +02:00
Akim Demaille
e50de09886 tokens: properly define the YYEOF token kind
Currently EOF is handled in an adhoc way, with a #define YYEOF 0 in
the implementation file.  As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.

Give the $end token a visible kind name, YYEOF.  Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.

* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
2020-04-12 13:56:44 +02:00
Akim Demaille
a4ed94bc13 tokens: properly define the "error" token kind
There are people out there that do use YYERRCODE (the token kind of
the error token).  See for instance
3812012bb7/unixODBC-2.3.2/Drivers/nn/yylex.c.

Currently, YYERRCODE is defined by yacc.c in an adhoc way as a #define
in the *.c file only.  It belongs with the other token kinds.

YYERRCODE is not a nice name, it does not fit in our naming scheme.
YYERROR would be more logical, but it collides with the YYERROR macro.
Shall we keep the same name in all the skeletons?  Besides, to avoid
collisions in C, we need to apply the api prefix: YYERRCODE is
actually <PREFIX>ERRCODE.  This is not needed in the other languages.

* data/skeletons/bison.m4 (b4_symbol_token_kind): New.
Map the error token to "YYERRCODE".
* data/skeletons/yacc.c (YYERRCODE): Don't define it, it's handled by...
* src/output.c (prepare_symbol_definitions): this.
* tests/input.at (Redefining the error token): Check it.
2020-04-12 13:56:43 +02:00
Akim Demaille
8dcc25a1e4 style: rename YYNOMEM as YYENOMEM
This is clearer.

* data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): Rename as...
(YYENOMEM): here.
2020-04-10 18:35:29 +02:00
Akim Demaille
bbb9750b3e skeletons: introduce api.symbol.prefix
* data/skeletons/bison.m4 (b4_symbol_prefix): New.
(b4_symbol_kind): Use it.
* data/skeletons/c++.m4, data/skeletons/c.m4, data/skeletons/d.m4
* data/skeletons/java.m4 (api.symbol.prefix): Provide a default value.

* data/skeletons/glr.c, data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java, data/skeletons/yacc.c:
Adjust: use b4_symbol_prefix instead of YYSYMBOL_.
2020-04-07 08:40:16 +02:00
Akim Demaille
87579e03e0 skeletons: beware not to use yyarg when it's null
Reported by Adrian Vogelsgesang.

* data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Here.
2020-04-06 19:14:11 +02:00
Akim Demaille
f0bb82ae9e skeletons: use consistently "kind" instead of "type" in the code
* data/skeletons/bison.m4, data/skeletons/c++.m4, data/skeletons/c.m4,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java:
Refer to the "kind" of a symbol, not its "type", where appropriate.
2020-04-05 19:14:39 +02:00
Akim Demaille
2c05fc750a c, c++: rename yysymbol_type_t as yysymbol_kind_t
See https://lists.gnu.org/r/bison-patches/2020-04/msg00031.html

* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/yacc.c
(yysymbol_type_t): Rename as...
(yysymbol_kind_t): this.
Adjust dependencies.
* data/skeletons/c++.m4, data/skeletons/glr.cc, data/skeletons/lalr1.cc
(symbol_type_type): Rename as...
(symbol_kind_type): this.
Adjust dependencies.
2020-04-05 14:56:18 +02:00
Akim Demaille
4e26809ab9 style: rename yysyntax_error_arguments as yy_syntax_error_arguments
It's a private implementation detail.

* NEWS, data/skeletons/glr.c, data/skeletons/lalr1.cc,
* data/skeletons/yacc.c, doc/bison.texi: here.
2020-04-05 08:56:21 +02:00
Akim Demaille
76e11b5a3e c: rename yyparse_context_t as yypcontext_t
The first name is too long.  We already have `yypstate`, so
`yypcontext` is ok.  We are also migrating to using `*_t` for our
types.

* NEWS, data/skeletons/glr.c, data/skeletons/yacc.c, doc/bison.texi,
* examples/c/bistromathic/parse.y, src/parse-gram.y, tests/local.at:
(yyparse_context_t, yyparse_context_location, yyparse_context_token):
Rename as...
(yypcontext_t, yypcontext_location, yypcontext_token): these.
2020-04-04 19:20:29 +02:00
Akim Demaille
086506bf23 glr.c, yacc.c: propagate yysymbol_type_t
Now that yacc.c and glr.c both know yysymbol_type_t, convert the
common routines.

* data/skeletons/c.m4 (yydestruct, yy_symbol_value_print)
(yy_symbol_print): Use yysymbol_type_t instead of int.
* data/skeletons/glr.c: Use yySymbol where appropriate.
* data/skeletons/yacc.c (YY_ACCESSING_SYMBOL): New wrapper around
yystos.
Use it.
* tests/local.at (yyreport_syntax_error): Use yysymbol_type_t where
appropriate.
2020-04-01 08:31:48 +02:00
Akim Demaille
9039c571f4 yacc.c: fix more errors from make maintainer-check-g++
* data/skeletons/yacc.c (yyexpected_tokens): Use casts where needed.
2020-04-01 08:31:48 +02:00
Akim Demaille
9434571f95 yacc.c: revert to not using yysymbol_type_t in the yytranslate table
This triggers warnings with several compilers.  For instance ICC fills
the logs with pages and pages of

    input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}"
             0,     2,     2,     2,     2,     2,     2,     2,     2,     2,
             ^

    input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}"
             0,     2,     2,     2,     2,     2,     2,     2,     2,     2,
                    ^

And so does G++9 when compiling yacc.c's (C) output

    input.c:545:8: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive]
      545 |        0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
          |        ^
          |        |
          |        int
    input.c:545:15: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive]
      545 |        0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
          |               ^
          |               |
          |               int

Clang++ is no exception

    input.c:545:8: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int'
           0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
           ^
    input.c:545:15: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int'
           0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
                  ^

At some point we could use yysymbol_type_t's enumerators to define
yytranslate.  Meanwhile...

* data/skeletons/yacc.c (yytranslate): Use the original integral type
to define it.
(YYTRANSLATE): Cast the result into yysymbol_type_t.
2020-04-01 08:31:48 +02:00
Akim Demaille
75a605454d yacc.c: prefer YYSYMBOL_YYERROR to YYSYMBOL_error
* data/skeletons/bison.m4 (b4_symbol_sid): Map "error" to YYSYMBOL_YYERROR.
* data/skeletons/yacc.c: Adjust.
2020-04-01 08:31:48 +02:00
Akim Demaille
f3c18c8e80 yacc.c: also define a symbol number for the empty token
This is not only cleaner, it also protects us from mixing signed
values (YYEMPTY is #defined as -2) with unsigned types (the
yysymbol_type_t enum is typically compiled as a small unsigned).
For instance GCC 9:

    input.c: In function 'yyparse':
    input.c:1107:7: error: conversion to 'unsigned int' from 'int'
                           may change the sign of the result
                           [-Werror=sign-conversion]
     1107 |   yyn += yytoken;
          |       ^~
    input.c:1107:10: error: conversion to 'int' from 'unsigned int'
                            may change the sign of the result
                            [-Werror=sign-conversion]
     1107 |   yyn += yytoken;
          |          ^~~~~~~
    input.c:1108:47: error: comparison of integer expressions of
                            different signedness:
                            'yytype_int8' {aka 'const signed char'} and
                            'yysymbol_type_t' {aka 'enum yysymbol_type_t'}
                            [-Werror=sign-compare]
     1108 |   if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken)
          |                                               ^~
    input.c:702:25: error: operand of ?: changes signedness from 'int'
                           to 'unsigned int' due to unsignedness of
                           other operand [-Werror=sign-compare]
      702 | #define YYEMPTY         (-2)
          |                         ^~~~
    input.c:1220:33: note: in expansion of macro 'YYEMPTY'
     1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
          |                                 ^~~~~~~
    input.c:1220:41: error: unsigned conversion from 'int' to
                            'unsigned int' changes value
                            from '-2' to '4294967294'
                            [-Werror=sign-conversion]
     1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
          |                                         ^

Eventually, it might be interesting to move away from -2 (which is the
only possible negative symbol number) and use the next available
number, to save bits.  We could actually even simply use "0" and shift
the rest, which would allow to write "!yytoken" to mean really
"yytoken != YYEMPTY".

* data/skeletons/c.m4 (b4_declare_symbol_enum): Define YYSYMBOL_YYEMPTY.
* data/skeletons/yacc.c: Use it.

* src/parse-gram.y (yyreport_syntax_error): Use YYSYMBOL_YYEMPTY, not
YYEMPTY, when dealing with a symbol.

* tests/regression.at: Adjust.
2020-04-01 08:31:48 +02:00
Akim Demaille
00c80bc96c yacc.c: use yysymbol_type_t instead of int for yytoken
Now that we have a proper type for internal symbol numbers, let's use
it.  More code needs conversion, e.g., printers and destructors, but
they are shared with glr.c, which is not ready yet for this change.

It will also help us deal with warnings such as (GCC9 on GNU/Linux):

    input.c: In function 'int yyparse()':
    input.c:475:37: error: enumeral and non-enumeral type in conditional expression [-Werror=extra]
      475 |   (0 <= (YYX) && (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYSYMBOL_YYUNDEF)
          |    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    input.c:1024:17: note: in expansion of macro 'YYTRANSLATE'
     1024 |       yytoken = YYTRANSLATE (yychar);
          |                 ^~~~~~~~~~~

* data/skeletons/yacc.c (yytranslate, yysymbol_name)
(yyparse_context_t, yyexpected_tokens, yypstate_expected_tokens)
(yysyntax_error_arguments):
Use yysymbol_type_t instead of int.
2020-04-01 08:31:48 +02:00
Akim Demaille
3ba001baac yacc.c: introduce an enum that defines the symbol's number
There's a number of advantage in exposing the symbol (internal)
numbers:

- custom error messages can use them to decide how to represent a
  given symbol, or a set of symbols.

- we need something similar in uses of yyexpected_tokens.  For
  instance, currently, bistromathic's completion() reads:

    int ntokens = expected_tokens (line, tokens, YYNTOKENS);
    [...]
    for (int i = 0; i < ntokens; ++i)
      if (tokens[i] == YYTRANSLATE (TOK_VAR))
      [...]
      else if (tokens[i] == YYTRANSLATE (TOK_FUN))
      [...]
      else
      [...]

- now that it's a compile-time expression, we can easily build static
  tables, switch, etc.

- some users depended on the ability to get the token number from a
  symbol to write test cases for their scanners.  But Bison 3.5
  removed the table this feature depended upon (a reverse
  yytranslate).  Now they can check against the actual symbol number,
  without having pay (space and time) a conversion.
  See https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html, and
  https://lists.gnu.org/archive/html/bug-bison/2020-03/msg00015.html.

- it helps us clearly separate the internal symbol numbers from the
  external token numbers, whose difference is sometimes blurred in the
  code when values coincide (e.g. "yychar = yytoken = YYEOF").

- it allows us to get rid of ugly macros with inconsistent names such
  as YYUNDEFTOK and YYTERROR, and to group related definitions
  together.

- similarly it provides a clean access to the $accept symbol (which
  proves convenient in a current experimentation of mine with several
  %start symbols).

Let's declare this type as a private type (in the *.c file, not
the *.h one).  So it does not need to be influenced by the api prefix.

* data/skeletons/bison.m4 (b4_symbol_sid): New.
(b4_symbol): Use it.
* data/skeletons/c.m4 (b4_symbol_enum, b4_declare_symbol_enum): New.
* data/skeletons/yacc.c: Use b4_declare_symbol_enum.
(YYUNDEFTOK, YYTERROR): Remove.
Use the corresponding symbol enum instead.
2020-04-01 08:31:33 +02:00