The CI has "failures" such as (253, "Null nonterminals"):
@@ -21,7 +21,7 @@
3: 3 b: . %empty
3: 4 c: . %empty
On Symbols: {A,}
-time limit exceeded: 6.000000
+time limit exceeded: 11.000000
First Example c • c A A $end
First derivation $accept ::=[ a ::=[ c d ::=[ a ::=[ b ::=[ • ] d ::=[ c A A ] ] ] ] $end ]
Second Example c • A $end
* tests/counterexample.at (AT_BISON_CHECK_CEX): New.
Use it to neutralize differences in timeout values.
* src/parse-simulation.c: Replace reference counting with
parse_state_retain everywhere.
(free_parse_state): Make this function iterative instead of
recursive. Long parse_state chains were causing stack exhaustion.
* tests/counterexample.at: Fix expectations.
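The recursive-to-iterative change to free_parse_state can be pictured with the following sketch. It is not the code from src/parse-simulation.c: the node type, node_new, node_release and freed_count are hypothetical stand-ins for a reference-counted parse state that holds a reference to its parent. The point is only that releasing a long chain in a loop keeps stack usage constant.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical stand-in for a parse state: a reference-counted node
// that owns one reference to its parent.
typedef struct node
{
  struct node *parent;
  int refcount;
} node;

// Number of nodes freed so far (only here to observe the behavior).
int freed_count = 0;

// Create a node with one reference; it takes over the caller's
// reference to PARENT.
node *
node_new (node *parent)
{
  node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->parent = parent;
  n->refcount = 1;
  return n;
}

// Drop one reference to N.  Instead of recursing into the parent,
// walk up the chain in a loop, so a chain of any length uses a
// constant amount of stack.
void
node_release (node *n)
{
  while (n)
    {
      assert (n->refcount > 0);
      if (--n->refcount > 0)
        break;                  // Still shared: stop here.
      node *parent = n->parent;
      free (n);
      ++freed_count;
      n = parent;               // Now drop the reference N held.
    }
}
```

A recursive version would call itself once per ancestor, which is where long chains exhaust the stack.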
Fixes the SEGV in test 247 (counterexample.at:195): "S/R after first
token".
* src/counterexample.c: here.
* tests/counterexample.at: Fix expectations.
* src/counterexample.c, src/derivation.c:
Do not output diagnostics on stdout, that's the job of stderr, and the
testsuite heavily depends on this.
Do not leave trailing spaces in the output.
* tests/counterexample.at: Use AT_KEYWORDS.
Specify the expected outputs.
* tests/local.mk: Add counterexample.at.
In Bison 3.6.2, comments containing brackets lose their brackets
because of improper m4 quotation.
* data/skeletons/bison.m4 (b4_gsub): New.
* data/skeletons/c-like.m4 (_b4_comment): Use it.
* tests/m4.at: Check b4_gsub.
With input such as
%token<fl> yVL_CLOCK "/*verilator sc_clock*/"
we generate
yVL_CLOCK = 610, /* "/*verilator sc_clock*/" */
which is invalid since the comment will actually be closed on the
first "*/". Let's turn "*/" into "*\/" to avoid this. But GCC will
also warn about "/*" inside a comment, so let's "escape" it too.
Reported by Huang Rui.
https://github.com/akimd/bison/issues/38
* data/skeletons/c-like.m4 (_b4_comment): Escape comment delimiters in
comments.
* tests/input.at (Torturing the Scanner): Check these cases.
* tests/m4.at: New.
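The escaping described above is done in m4 by _b4_comment. As an illustration only, here is the same transformation sketched in C. The function name escape_comment_delimiters is hypothetical, and escaping the opening delimiter by inserting a backslash after the slash is an assumption, since the text only says it is "escaped".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Return a newly allocated copy of TEXT that is safe to embed in a C
// block comment: a backslash is inserted between the two characters
// of every comment-closing and comment-opening sequence.  The caller
// frees the result.
char *
escape_comment_delimiters (const char *text)
{
  assert (text);
  // Worst case: one backslash added after every character.
  char *res = malloc (2 * strlen (text) + 1);
  if (!res)
    abort ();
  char *out = res;
  for (const char *in = text; *in; ++in)
    {
      *out++ = *in;
      if ((in[0] == '*' && in[1] == '/')
          || (in[0] == '/' && in[1] == '*'))
        *out++ = '\\';
    }
  *out = '\0';
  return res;
}
```

Applied to the alias of yVL_CLOCK, this yields a string that neither closes the enclosing comment early nor triggers the GCC warning about a nested comment opener.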
Reported by Martin Blais <blais@furius.ca>.
https://lists.gnu.org/r/help-bison/2020-05/msg00005.html
* data/skeletons/lalr1.cc (symbol_name): Make it public.
Add a private hidden hook to enable testing of private parts.
* tests/local.at (AT_DATA_GRAMMAR_PROLOGUE): Help Emacs find the right
language mode.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check that we
can read symbol_name.
AIX 7.1 supports diff -u, but its output does not match the expected
one.
Reported by Bruno Haible.
https://lists.gnu.org/r/bug-bison/2020-05/msg00049.html
* tests/atlocal.in (DIFF_U_WORKS): New.
* tests/local.at (AT_DIFF_U_CHECK): New.
* tests/existing.at (_AT_TEST_EXISTING_GRAMMAR): Use AT_DIFF_U_CHECK.
I don't plan to fix everything in one go. But this was in the way of
the next commit.
* data/skeletons/lalr1.java: Avoid space before parens.
* tests/java.at: Adjust.
From
public interface Lexer {
/* Token kinds. */
/** Token number, to be returned by the scanner. */
static final int YYEOF = 0;
/** Token number, to be returned by the scanner. */
static final int YYERRCODE = 256;
/** Token number, to be returned by the scanner. */
static final int YYUNDEF = 257;
/** Token number, to be returned by the scanner. */
static final int BANG = 258;
...
/** Deprecated, use b4_symbol(0, id) instead. */
public static final int EOF = YYEOF;
to
public interface Lexer {
/* Token kinds. */
/** Token "end of file", to be returned by the scanner. */
static final int YYEOF = 0;
/** Token error, to be returned by the scanner. */
static final int YYerror = 256;
/** Token "invalid token", to be returned by the scanner. */
static final int YYUNDEF = 257;
/** Token "!", to be returned by the scanner. */
static final int BANG = 258;
...
/** Deprecated, use YYEOF instead. */
public static final int EOF = YYEOF;
* data/skeletons/java.m4 (b4_token_enum): Display the symbol's tag in
comment.
* data/skeletons/lalr1.java: Address overquotation issue.
* examples/java/calc/Calc.y, examples/java/simple/Calc.y: Use YYEOF,
not EOF.
On an invalid character literal such as "'\777'" we used to produce
two errors:
input.y:2.9-12: error: invalid number after \-escape: 777
input.y:2.8-13: error: empty character literal
Get rid of the second one.
* src/scan-gram.l (STRING_GROW_ESCAPE): New.
* tests/input.at: Adjust.
I'm quite pleased to see that the tricky case of glr.c was already
prepared by the changes to support syntax_error exceptions. Better
yet, it is actually syntax_error that becomes a special case of the
general pattern: make yytoken be YYERRCODE.
* data/skeletons/glr.c (YYFAULTYTOK): Remove the now useless (Basil)
Faulty token.
Instead, use the error token.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: When computing
the action, first check the case of the error token.
* tests/calc.at: Check cases for the error token symbols before and
after it.
* data/skeletons/yacc.c (yyparse): When the scanner returns YYERRCODE,
go directly to error recovery (yyerrlab1).
However, don't keep the error token as lookahead, that token is too
special.
* data/skeletons/lalr1.cc: Likewise.
* examples/c/bistromathic/parse.y (yylex): Use that feature to report
invalid characters nicely.
* examples/c/bistromathic/bistromathic.test: Check that.
* examples/test: Neutralize gratuitous differences such as rule
position.
* tests/calc.at: Check that case in C only.
The other cases seem to be working, but that's an illusion that the
next commit will address (in fact, they can enter endless loops, and
report the error several times anyway).
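A scanner can use this feature roughly as follows. This is a standalone sketch, not the yylex from examples/c/bistromathic/parse.y: the token kinds and the scan function are hypothetical, and TOK_ERRCODE merely stands for the kind of the error token (YYERRCODE above), which would normally come from the generated header.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

// Hypothetical token kinds.  In a real parser they are generated.
enum token_kind
{
  TOK_EOF = 0,
  TOK_ERRCODE = 256,   // Stands for the kind of the error token.
  TOK_NUM = 258,
  TOK_PLUS = 259
};

// Scan one token starting at *CURSOR and advance *CURSOR past it.
// On an invalid character, report it here and return the error
// token, so that the parser can enter error recovery directly
// without issuing a second message.
int
scan (const char **cursor, int *value)
{
  assert (cursor && *cursor && value);
  const char *cp = *cursor;
  while (*cp == ' ' || *cp == '\t')
    ++cp;
  int kind;
  if (*cp == '\0')
    kind = TOK_EOF;
  else if (isdigit ((unsigned char) *cp))
    {
      *value = 0;
      while (isdigit ((unsigned char) *cp))
        *value = *value * 10 + (*cp++ - '0');
      kind = TOK_NUM;
    }
  else if (*cp == '+')
    {
      ++cp;
      kind = TOK_PLUS;
    }
  else
    {
      fprintf (stderr, "error: invalid character: %c\n", *cp);
      ++cp;
      kind = TOK_ERRCODE;
    }
  *cursor = cp;
  return kind;
}
```

The invalid character is consumed before returning, so the scanner makes progress even if the parser calls it again during recovery.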
We will not keep YYERRCODE anyway, it causes backward compatibility
issues. So as a first step, let all the skeletons use that name,
until we have a better one.
* data/skeletons/bison.m4, data/skeletons/glr.c,
* data/skeletons/glr.cc, data/skeletons/lalr1.cc,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/yacc.c, doc/bison.texi, tests/headers.at,
* tests/input.at:
here.
On macOS, wc -l always prepends the result with a tab, even when fed
by stdin. But anyway, we should have used `grep -c -v`, which appears
to be portable according to Autoconf's "Limitations of Usual Tools"
section.
Reported by Denis Excoffier.
https://lists.gnu.org/r/bug-bison/2020-04/msg00009.html
* tests/calc.at (_AT_CHECK_CALC): Use grep's -c instead.
Why didn't I think about this before??? symbolName should be a method
of SymbolKind.
* data/skeletons/lalr1.java (YYParser::yysymbolName): Move as...
* data/skeletons/java.m4 (SymbolKind::getName): this.
Make the table a static final table, not a local variable.
Adjust dependencies.
* doc/bison.texi (Java Parser Interface): Document i18n.
(Java Parser Context Interface): Document SymbolKind.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
* doc/bison.texi (C++ Parser Context): New.
* data/skeletons/lalr1.cc (parser::yysymbol_name): Rename as...
(parser::symbol_name): this.
(A Complete C++ Example): Promote LAC, now that we have it.
Promote parse.error detailed over verbose.
* examples/c++/calc++/calc++.test, tests/local.at: Adjust.
The user should think of yypcontext fields as accessible only via
yypcontext_* functions. So let's rename yyexpected_tokens to reflect
that.
Let's _not_ rename yyreport_syntax_error, as the user may define this
function, and is not allowed to access the fields of yypcontext_t
directly: she *must* use the "accessors". This is comparable to
the case of C++/Java where the user defines
parser::report_syntax_error, not parser::context::report_syntax_error.
* data/skeletons/glr.c, data/skeletons/yacc.c (yyexpected_tokens):
Rename as...
(yypcontext_expected_tokens): this.
Adjust dependencies.
The name "$end" is nice in the report, in particular it keeps
pointed-rules (aka items) from being too long. It also helps keep them
"standard".
But it is bad in error messages, we should report "end of file" (or
maybe "end of input", this is debatable). So, unless the user already
defined the alias for the $end token herself, make it "end of file".
It should even be translated if the user already translated some
tokens, so that there is now no strong reason to redefine the $end
token.
* src/output.c (prepare_symbol_names): Issue "end of file" instead of
"$end".
* data/skeletons/lalr1.java (yytnamerr_): Remove the renaming hack.
* build-aux/update-test: Accept files with names containing a "+",
such as c++.at.
* tests/actions.at, tests/c++.at, tests/conflicts.at,
* tests/glr-regression.at, tests/regression.at, tests/skeletons.at:
Adjust.
Yet, don't change the structure identifier to avoid introducing
conflicts in Vincent Imbimbo's PR (which, amusingly enough, is about
conflicts).
* src/symtab.c: here.
* tests/diagnostics.at, tests/input.at: Adjust.
yy::parser features a parse() function, not a yyparse() one.
* data/skeletons/lalr1.cc (yyreport_syntax_error)
(context::yyexpected_tokens): Rename as...
(report_syntax_error, context::expected_tokens): these.
Currently EOF is handled in an ad hoc way, with a #define YYEOF 0 in
the implementation file. As a result, the user has to define her own
EOF token if she wants to use it, which is a pity.
Give the $end token a visible kind name, YYEOF. Except that in C,
where enums are not scoped, we would have collisions between all the
definitions of YYEOFs in the header files, so in C, make it
<api.PREFIX>EOF.
* data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions.
Unless the user already gave it a different name.
* data/skeletons/glr.c (YYEOF): Remove.
Use ]b4_symbol(0, [id])[ instead.
Add support for "pre_epilogue", for glr.cc.
* data/skeletons/glr.cc: Remove dead code (never emitted #undefs).
* data/skeletons/yacc.c
* src/parse-gram.c
* src/reader.c
* src/symtab.c
* tests/actions.at
* tests/input.at
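The collision that motivates the prefixed name in C can be sketched as follows. The two enums stand for the token kinds of two parsers used in the same translation unit. The prefixes and enumerator names are hypothetical and only illustrate the <api.PREFIX>EOF scheme.

```c
#include <assert.h>

// Token kinds of a first parser, with a hypothetical prefix CALC_.
enum calc_token_kind
{
  CALC_EOF = 0,
  CALC_NUM = 258
};

// Token kinds of a second parser, with a hypothetical prefix JSON_.
enum json_token_kind
{
  JSON_EOF = 0,
  JSON_STRING = 258
};

// C enumerators share a single scope.  Had both enums declared an
// enumerator named YYEOF, including both headers in one file would
// be a redeclaration error.  With distinct prefixes they coexist.
int
is_end_of_input (int calc_kind, int json_kind)
{
  return calc_kind == CALC_EOF && json_kind == JSON_EOF;
}
```

In C++ and Java the enumerators live inside the parser class, so the unprefixed name causes no such clash.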
* data/skeletons/bison.m4 (b4_symbol_token_kind): Give a definition to
$undefined.
(b4_token_visible_if): $undefined has an id.
* src/output.c (prepare_symbol_definitions): Stop lying: $undefined
_is_ a token.
* tests/input.at: Adjust.
There are people out there that do use YYERRCODE (the token kind of
the error token). See for instance
3812012bb7/unixODBC-2.3.2/Drivers/nn/yylex.c.
Currently, YYERRCODE is defined by yacc.c in an ad hoc way as a #define
in the *.c file only. It belongs with the other token kinds.
YYERRCODE is not a nice name, it does not fit in our naming scheme.
YYERROR would be more logical, but it collides with the YYERROR macro.
Shall we keep the same name in all the skeletons? Besides, to avoid
collisions in C, we need to apply the api prefix: YYERRCODE is
actually <PREFIX>ERRCODE. This is not needed in the other languages.
* data/skeletons/bison.m4 (b4_symbol_token_kind): New.
Map the error token to "YYERRCODE".
* data/skeletons/yacc.c (YYERRCODE): Don't define it, it's handled by...
* src/output.c (prepare_symbol_definitions): this.
* tests/input.at (Redefining the error token): Check it.
A forthcoming commit (tokens: properly define the "error" token kind)
revealed a problem in the C++ generated headers: they are not
self-contained. With this file:
%language "c++"
%define api.value.type variant
%code {
static int yylex (yy::parser::semantic_type *lvalp);
}
%token <int> X
%%
exp:
X { printf ("x\n"); }
;
%%
void
yy::parser::error (const std::string& m)
{
std::cerr << m << '\n';
}
static
int yylex (yy::parser::semantic_type *lvalp)
{
static int const input[] = {yy::parser::token::X, 0};
static int toknum = 0;
return input[toknum++];
}
int
main (int argc, char const* argv[])
{
yy::parser p;
return p.parse ();
}
the generated header fails to compile cleanly (foo.cc just #includes
the generated header):
$ clang++-mp-9.0 -c -Wundefined-func-template foo.cc
In file included from foo.cc:1:
bar.tab.hh:550:12: warning: instantiation of function 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' required here, but no definition is available
[-Wundefined-func-template]
struct symbol_type : basic_symbol<by_type>
^
bar.tab.hh:436:7: note: forward declaration of template entity is here
basic_symbol (basic_symbol&& that);
^
bar.tab.hh:550:12: note: add an explicit instantiation declaration to suppress this warning if 'yy::parser::basic_symbol<yy::parser::by_type>::basic_symbol' is explicitly instantiated
in another translation unit
struct symbol_type : basic_symbol<by_type>
^
1 warning generated.
* data/skeletons/c++.m4 (b4_public_types_define): Move the
implementation of the basic_symbol move-ctor to...
(b4_public_types_define): here, its declaration.
* tests/headers.at (Sane headers): Use a declared token so that the
corresponding token constructor is declared. Which triggers the
aforementioned issue.
That's one nice benefit from using enums.
* data/skeletons/lalr1.java (YYSYMBOL_YYEMPTY): No longer define it.
Use 'null' instead.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
* maint:
maint: post-release administrivia
version 3.5.4
examples: reccalc: really compile cleanly in C99
news: announce that Bison 3.6 drops YYERROR_VERBOSE
news: update for 3.5.4
style: fix spellos
typo: succesful -> successful
package: improve the readme
java: check and fix support for api.token.raw
java: style: prefer 'int[] foo' to 'int foo[]'
build: fix syntax-check issues
tests: recheck: work properly when the test suite was interrupted
doc: c++: promote api.token.raw
build: fix compatibility with old compilers
examples: reccalc: compile cleanly in C99