symbol_type::token () was removed: it returned the token kind of a
symbol. To do that, one needs to convert from the symbol kind to the
token kind, which requires a table.
This broke some users' unit tests for scanners, see
https://lists.gnu.org/r/bug-bison/2020-01/msg00001.htmlhttps://lists.gnu.org/r/bug-bison/2020-03/msg00020.htmlhttps://lists.gnu.org/r/help-bison/2020-04/msg00005.html
Instead of making this possible again, let's check the symbol's kind
instead. So give proper access to a symbol's kind.
That feature existed, undocumented, as 'type_get()'. Let's rename
this as 'kind()'.
* data/skeletons/c++.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc (type_get): Rename as...
(kind): This.
(type_get): Install a backward compatibility alias.
* doc/bison.texi (Complete Symbols): Document symbol_type and
symbol_type::kind.
* data/skeletons/c++.m4: Define the old names in terms on the new
ones, instead of the converse.
* doc/bison.texi (C++ Parser Interface): Be more extensive about
token_kind_type.
Why didn't I think about this before??? symbolName should be a method
of SymbolKind.
* data/skeletons/lalr1.java (YYParser::yysymbolName): Move as...
* data/skeletons/java.m4 (SymbolKind::getName): this.
Make the table a static final table, not a local variable.
Adjust dependencies.
* doc/bison.texi (Java Parser Interface): Document i18n.
(Java Parser Context Interface): Document SymbolKind.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
* doc/bison.texi (C++ Parser Context): New.
* data/skeletons/lalr1.cc (parser::yysymbol_name): Rename as...
(parser::symbol_name): this.
(A Complete C++ Example): Promote LAC, now that we have it.
Promote parse.error detailed over verbose.
* examples/c++/calc++/calc++.test, tests/local.at: Adjust.
* NEWS (Deep overhaul of the symbol and token kinds): New.
* doc/bison.texi: Promote YYEOF over "0" in scanners.
(Token Decl): No longer show YYEOF here, it now works by default.
(Token I18n): More details about YYEOF here.
(Calc++): Just use YYEOF.
* doc/bison.texi: Replace occurrences of "token type" with "token
kind".
Stop referring to the "macro definitions" of the token kinds, just
name them "definitions".
The first name is too long. We already have `yypstate`, so
`yypcontext` is ok. We are also migrating to using `*_t` for our
types.
* NEWS, data/skeletons/glr.c, data/skeletons/yacc.c, doc/bison.texi,
* examples/c/bistromathic/parse.y, src/parse-gram.y, tests/local.at:
(yyparse_context_t, yyparse_context_location, yyparse_context_token):
Rename as...
(yypcontext_t, yypcontext_location, yypcontext_token): these.
* doc/bison.texi (Parser Internationalization): Move most of its
content into...
(Enabling I18n): this new node.
(Token I18n): New.
(Token Decl): Refer to token internationalization.
(Error Reporting Function): Promote parse.error detailed.
* doc/bison.texi (Tokens from Literals): Move to code using
%token-table to...
(Decl Summary: %token-table): here.
* data/skeletons/bison.m4: Implement mutual exclusion.
* tests/input.at: Check it.
* doc/local.mk: Be robust to the removal of doc/.
With texinfo.tex 2019-09-24.13, node names with + are not properly
handled.
https://lists.gnu.org/r/bug-texinfo/2020-02/msg00004.html
* doc/bison.texi: Always use the three-argument form for references to
node with a + in the name.
The Java skeleton displays
Reading a token:
Next token is token "number" (1)
while the other display
Reading a token: Next token is token "number" (1)
When generating logs in the scanner, the first part is separated from
the second, and the end of the scanner logs have the second part
pasted in. So let's propagate the Java way, but with the colon.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java, data/skeletons/yacc.c: Do it.
Adjust test cases and doc.
The PDF output is more consistent: some nodes were not pointed to with
their title. The HTML output becomes "see section Foo" instead of
"see Foo", but this should be addressed in the next Texinfo release.
Info output is simplified, as it uses only the node name and not its
title. But it's considered easier to read this way.
See https://lists.gnu.org/r/help-texinfo/2020-01/msg00031.html.
* doc/bison.texi: Set @xrefautomaticsectiontitle on.
Simplify all uses of ref.
* data/skeletons/glr.c (YYASSERT): Rename as...
(YY_ASSERT): this, for consistency with yacc.c, and also to emphasize
the fact that this is not for the end user (YY_ prefix).
* tests/glr-regression.at: Define parse.assert.
String literals as tokens serve two distinct purposes: freeing from
having to implement the keyword matching in the scanner, and improving
error messages. Most of the time both can be achieved at the same
time, but on occasions, it does not work so well.
We promote their use for error messages. We will also still support
the former case, but it is _not_ the recommended approach.
* doc/bison.texi (Tokens from Literals): Clearly state that we don't
recommend looking up the token types in the list of token names.
String literals, which allow for better error messages, are (too)
liberally accepted by Bison, which might result in silent errors. For
instance
%type <exVal> cond "condition"
does not define “condition” as a string alias to 'cond' (nonterminal
symbols do not have string aliases). It is rather equivalent to
%nterm <exVal> cond
%token <exVal> "condition"
i.e., it gives the type 'exVal' to the "condition" token, which was
clearly not the intention.
Introduce -Wdangling-alias to catch this.
* src/complain.h, src/complain.c: Add support for -Wdangling-alias.
(argmatch_warning_args): Sort.
* src/symtab.c (symbol_check_defined): Complain about dangling
aliases.
* doc/bison.texi: Document it.
* tests/input.at (Dangling aliases): New test.
As an extension to POSIX Yacc, Bison's %type accepts tokens.
Unfortunately with string literals as implicit tokens, this is
misleading, and led some users to write
%type <exVal> cond "condition"
believing that "condition" would be associated to the 'cond'
nonterminal (see https://github.com/apache/httpd/pull/72).
* doc/bison.texi: Promote %nterm rather than %type to declare the type
of nonterminals.
We still have a few old C casts in lalr1.cc, let's get rid of them.
Reported by Frank Heckenbach.
Actually, let's monitor all our casts using easy to grep macros.
Let's use these macros to use the C++ standard casts when we are in
C++.
* data/skeletons/c.m4 (b4_cast_define): New.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/stack.hh,
* data/skeletons/yacc.c:
Use it and/or its casts.
* tests/actions.at, tests/cxx-type.at,
* tests/glr-regression.at, tests/headers.at, tests/torture.at,
* tests/types.at:
Use YY_CAST instead of C casts.
* configure.ac (warn_cxx): Add -Wold-style-cast.
* doc/bison.texi: Disable it.
This changes the Yacc skeleton to use “least” integer types to
keep tables smaller on some platforms, which should lessen cache
pressure. Since Bison uses the Yacc skeleton, it follows suit.
* data/skeletons/yacc.c: Include limits.h and stdint.h if this
seems to be needed.
(yytype_uint8, yytype_int8, yytype_uint16, yytype_int16):
If available, use GCC predefined macros __INT_MAX__ etc. to select
a “least” type, as this avoids namespace hassles. Otherwise, if
available fall back on selecting a “least” type via the C99 macros
INT_MAX, INT_LEAST8_MAX, etc. Otherwise, fall further back on one of
the builtin C99 types signed char, short, and int. Make sure that
any selected type promotes to int. Ignore any macros YYTYPE_INT16,
YYTYPE_INT8, YYTYPE_UINT16, YYTYPE_UINT8 defined by the user.
(ptrdiff_t, PTRDIFF_MAX): Simplify in the light of the above.
(yytype_uint8, yytype_uint16): Do not assume that unsigned char
and unsigned short promote to int, as this isn’t true on some
platforms (e.g., TI TMS320C55x).
* src/parse-gram.y (YYTYPE_INT16, YYTYPE_INT8, YYTYPE_UINT16)
(YYTYPE_UINT8): Remove, as these are no longer effective.