The previous commit ("yacc.c: declare and initialize and the same
time") made b4_initialize_parser_state_variables useless.
* data/skeletons/yacc.c (b4_initialize_parser_state_variables): Inline
into...
(yypstate_clear): here.
In order to factor the code of push and pull parsers, the declaration
of the parser's state variable was common (being local variable in
pull parsers, and struct members in push parsers). This result in
rather poor style in pull parser, with first variable declarations,
and then their initializations.
The initialization is about to differ between push and pull parsers,
so it is no longer worth keeping both cases together.
* data/skeletons/yacc.c (b4_declare_parser_state_variables): Accept an
argument, and when it is set, initialize the variables.
Adjust dependencies.
Currently yypull_parse takes a yypstate* as argument, and accepts it
to be NULL. This does not seem to make a lot of sense: rather it is
its callers that should do that.
I believe this is historical: yypull_parse was introduced
first (c3d503425f), with yyparse being a
macro. So yyparse could hardly deal with memory allocation properly.
In 7172e23e8f that yyparse was turned
into a genuine function. At that point, it should have allocated its
own yypstate*, which would have left yypull_parse deal with only one
single non-null ypstate* argument.
Fortunately, it is nowhere documented that it is valid to pass NULL to
yypull_parse. It is now forbidden.
* data/skeletons/yacc.c (yypull_parse): Don't allocate a yypstate.
Needs a location to issue the error message.
(yyparse): Allocate the yypstate.
* README-hacking.md (Working from the Repository): Make it first to
make it easier to find the instructions to build from the repo.
(Implementation Notes): New.
* README: Provide more links.
Reported by Martin Blais and Yuriy Solodkyy.
https://lists.gnu.org/r/help-bison/2020-05/msg00011.htmlhttps://lists.gnu.org/r/bug-bison/2020-06/msg00038.html
While at it, modernize filename_type as api.filename.type and document
it properly.
* data/skeletons/c++.m4 (filename_type): Rename as...
(api.filename.type): this.
Default to const std::string.
* data/skeletons/location.cc (position, location): Expose the
filename_type type.
Use api.filename.type.
* doc/bison.texi (%define Summary): Document api.filename.type.
(C++ Location Values): Document position::filename_type.
* src/muscle-tab.c (muscle_percent_variable_update): Ensure backward
compatibility.
* tests/c++.at: Check that using const file names is ok.
tests/input.at: Check backward compat.
AFAICT, "dotted rule" is a more frequent synonym of "item" than
"pointed rule". So let's migrate to using "dot" only.
* doc/bison.texi: Use dot/'•' rather than point/'.'.
* src/print-xml.c (print_core): Use dot rather than point. This is
not backward compatible, but AFAICT, we don't have actual user of the
XML output (but ourselves). So...
* data/xslt/xml2dot.xsl, data/xslt/xml2text.xsl,
* data/xslt/xml2xhtml.xsl, tests/report.at: ... adjust.
This was a hack to make it easier for people to migrate from yacc.c to
lalr1.cc and from glr.c to glr.cc: when set, YYSTYPE and YYLTYPE were
`#defined`. It was never documented (just mentioned in NEWS for Bison
2.2, 2006-05-19), but was used to simplify the test suite. Stop that:
adjust the test suite to the skeletons, not the converse.
In C++ use yy::parser::semantic_type, yy::parser::location_type, and
yy::parser::token::MY_TOKEN, instead of YYSTYPE, YYLTYPE and MY_TOKEN.
* data/skeletons/glr.cc, data/skeletons/lalr1.cc: Remove its support.
* tests/actions.at, tests/c++.at, tests/calc.at: Adjust.
* upstream/maint:
maint: post-release administrivia
version 3.6.4
glr.cc: don't leak glr.c/glr.cc scaffolding to the user
Some fixes were needed to adjust to recent changes in glr.cc and
glr.c.
* data/skeletons/glr.cc: Stop messing with the user's epilogue to
insert glr.cc code. We need that code to be inserted _before_ the
user's epilogue, not after. So define b4_glr_cc_pre_epilogue.
* data/skeletons/glr.c: Use it.
Until we have a decent reimplementation of glr.cc, we have to use
tricks to shoehorn C++ symbols to the C engine of glr.c. Some of them
are done via #define. Unfortunately in Bison 3.6 some of these we
done in the header file, which broke valid user code.
Reported by Egor Pugin.
https://lists.gnu.org/r/bug-bison/2020-06/msg00003.html
* data/skeletons/glr.cc: Stop playing tricks with b4_pre_epilogue.
(b4_glr_cc_setup, b4_glr_cc_cleanup): New.
Much cleaner way to instal glr.cc's scaffolding around glr.c.
* data/skeletons/glr.c: Adjust to use them.
While defining api.header.include worked as expected, its default
value was incorrectly defined. As a result, by default, the generated
parsers still duplicated the content of the generated header instead
of including it.
* data/skeletons/yacc.c (api.header.include): Fix its default value.
* tests/output.at: Check it.
* doc/bison.texi (%define Summary): Document api.header.include.
While at it, move the definition of api.namespace at the proper
place.
Use colors to show the counterexamples and the derivations in color,
to highlight their structure. Align the outputs, and add i18n
support. Reduce width by using a one-space separator instead of
two-space.
From
Example A • B C
First derivation s ::=[ a ::=[ A • ] x ::=[ B C ] ]
Second derivation s ::=[ y ::=[ A • B ] c ::=[ C ] ]
to
Example A • B C
First derivation s ::=[ a ::=[ A • ] x ::=[ B C ] ]
Example A • B C
Second derivation s ::=[ y ::=[ A • B ] c ::=[ C ] ]
with colors.
* data/bison-default.css (cex-dot, cex-0, cex-1, cex-2, cex-3, cex-4)
(cex-5, cex-6, cex-7, cex-step, cex-leaf): New.
* src/derivation.c (derivation_print_styled_impl): New.
(derivation_print, derivation_print_leaves): Use it.
* src/counterexample.c: Reformat the output.
* tests/counterexample.at: Adjust.
For instance, in the case of Bison's own parser:
- case 40:
+ case 40: /* grammar_declaration: "%code" "identifier" "{...}" */
{
muscle_percent_code_grow ((yyvsp[-1].ID), (yylsp[-1]),
translate_code_braceless ((yyvsp[0].BRACED_CODE), (yylsp[0])),
(yylsp[0]));
code_scanner_last_string_free ();
}
break;
* data/skeletons/c.m4: Modified.
* data/skeletons/d.m4: Modified.
* data/skeletons/java.m4: Modified.
* src/output.c (output_escaped): New.
(quoted_output): Use it, and rename as...
(output_quoted): this.
Adjust dependencies.
(rule_output): New.
(user_actions_output): Use it.
* data/skeletons/c.m4, data/skeletons/d.m4, data/skeletons/java.m4
(b4_case): Add support for $3, an optional comment.
* upstream/maint:
maint: post-release administrivia
version 3.6.3
build: check -Wmissing-prototypes
tests: show logs
c++: fix printing of state number on streams
pstate_clear is lacking a prototype.
Reported by Ryan
https://lists.gnu.org/r/bug-bison/2020-05/msg00101.html
Besides, none of the C examples were compiled with the warning flags.
* configure.ac (warn_c): Add -Wmissing-prototypes.
* data/skeletons/yacc.c (pstate_clear): Make it static.
* examples/local.mk (TEST_CFLAGS): New.
* examples/c/bistromathic/local.mk, examples/c/calc/local.mk,
* examples/c/lexcalc/local.mk, examples/c/mfcalc/local.mk,
* examples/c/pushcalc/local.mk, examples/c/reccalc/local.mk,
* examples/c/rpcalc/local.mk:
Use it.
GCC's warn_unused_result is not silenced by a cast to void, so we have
to "use" scanf's result.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25509https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425
Flex generated code produces too many warnings, including things such
as, with ICC:
examples/c/lexcalc/scan.c(1088): error #1682: implicit conversion
of a 64-bit integral type to a smaller integral type (potential portability problem)
2259 YY_INPUT( (&YY_CURRENT_BUFFER_LVALUE->yy_ch_buf[number_to_move]),
2260 ^
2261
2262
I am tired of trying to fix Flex's output. The project does not seem
maintained. We ought to avoid it. So, for the time being, don't try
to enable warnings with Flex.
* examples/c/bistromathic/parse.y, examples/c/reccalc/scan.l: Fix
warnings.
* doc/bison.texi: Discard scanf's return value to defeat
-Werror=unused-result.
Teaches bison about a new command line option, --file-prefix-map OLD=NEW
(based on the -ffile-prefix-map option from GCC) which causes it to
replace and file path of OLD in the text of the output file with NEW,
mainly for header guards and comments. The primary use of this is to
make builds reproducible with different input paths, and in particular
the debugging information produced when the source code is compiled. For
example, a distro may know that the bison source code will be located at
"/usr/src/bison" and thus can generate bison files that are reproducible
with the following command:
bison --output=/build/bison/parse.c -d --file-prefix-map=/build/bison/=/usr/src/bison/ parse.y
Importantly, this will change the header guards and #line directives
from:
#ifndef YY_BUILD_BISON_PARSE_H
#line 100 "/build/bison/parse.h"
to
#ifndef YY_USR_SRC_BISON_PARSE_H
#line 100 "/usr/src/bison/parse.h"
which is reproducible.
See https://lists.gnu.org/r/bison-patches/2020-05/msg00016.html
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
* src/files.h, src/files.c (spec_mapped_header_file)
(mapped_dir_prefix, map_file_name, add_prefix_map): New.
* src/getargs.c (-M, --file-prefix-map): New option.
* src/output.c (prepare): Define b4_mapped_dir_prefix and
b4_spec_header_file.
* src/scan-skel.l (@ofile@): Output the mapped file name.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/location.cc,
* data/skeletons/yacc.c:
Adjust.
* doc/bison.texi: Document.
* tests/input.at, tests/output.at: Check.
Instead of generating switch statements with numbers, let's use the
symbol kinds. Not only is this more readable, it also makes reading
diff easier, as a change in symbol numbers won't have such a large
effect on the implementation of symbol actions.
* data/skeletons/bison.m4 (_b4_symbol_case): Use the symbol kind
rather than its number.
This should have been done in 3.6, but I wanted to avoid introducing
conflicts into Vincent's work on counterexamples. It turns out it's
completely orthogonal.
* data/README.md, data/skeletons/bison.m4, data/skeletons/c++.m4,
* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/java.m4,
* data/skeletons/lalr1.d, data/skeletons/lalr1.java,
* data/skeletons/variant.hh, data/skeletons/yacc.c, src/conflicts.c,
* src/derives.c, src/gram.c, src/gram.h, src/output.c,
* src/parse-gram.c, src/parse-gram.y, src/print-xml.c, src/print.c,
* src/reader.c, src/symtab.c, src/symtab.h, tests/input.at,
* tests/types.at:
s/user_token_number/code/g.
Plus minor changes.
In Bison 3.6.2, the comments with brackets lose their brackets, for
improper m4 quotation.
* data/skeletons/bison.m4 (b4_gsub): New.
* data/skeletons/c-like.m4 (_b4_comment): Use it.
* tests/m4.at: Check b4_gsub.
Let --trace=m4-early dump all the logs from the start (as --trace=m4
used to do), and have --trace=m4 now start traces only when actually
working of the user's grammar.
Can make a big difference in the case of small inputs. E.g.
$ bison -S tests/testsuite.dir/001/input.m4 tests/testsuite.dir/001/input.y --trace=m4 |& wc
3952 19446 251068
$ bison -S tests/testsuite.dir/001/input.m4 tests/testsuite.dir/001/input.y --trace=m4-early |& wc
19491 131904 1830495
* data/skeletons/traceon.m4: New.
* src/getargs.h, src/getargs.c: Introduce --trace=m4-early.
* src/output.c (output_skeleton): Adjust for --trace=m4 and --trace=m4-early.
With input such as
%token<fl> yVL_CLOCK "/*verilator sc_clock*/"
we generate
yVL_CLOCK = 610, /* "/*verilator sc_clock*/" */
which is invalid since the comment will actually be closed on the
first "*/". Let's turn "*/" into "*\/" to avoid this. But GCC will
also warn about "/*" inside a comment, so let's "escape" it too.
Reported by Huang Rui.
https://github.com/akimd/bison/issues/38
* data/skeletons/c-like.m4 (_b4_comment): Escape comment delimiters in
comments.
* tests/input.at (Torturing the Scanner): Check thes cases.
* tests/m4.at: New.
c.m4 contains a definition of _Noreturn which is modeled after
gnulib's lib/_Noreturn.h. The latter was recently
changed (b61bf2f0e8) to not using
[[noreturn]] at all, because the uses of _Noreturn in gnulib are
sometimes incompatible with the rules of [[noreturn]].
As a result glr.cc started to use _Noreturn in C++, which clang
refuses (all the glr.cc tests currently fail with Clang++).
* data/skeletons/c.m4 (b4_attribute_define): Restore the definition of
_Noreturn as [[noreturn]] in modern C++.
The generated code uses _Noreturn in places where [[noreturn]] is
valid.
* maint:
news: update
maint: post-release administrivia
version 3.6.1
c++: style: reorder generated code
c++: provide yy::parser::symbol_type::name
c++: make parser::symbol_name public
examples: beware of ~/.inputrc
build: also provide lzip compressed tarballs
style: minor fixes
yacc.c: restore ansi-c compatibility
Reported by Paul Eggert.
* src/getargs.c: We don't need it anyway, since we use _Noreturn.
* data/skeletons/c.m4: While at it, update the definition of _Noreturn
stolen from gnulib.
The implementation of yy::parser::symbol_name is emitted even before
the implementation of yy::parser::parser. This makes little sense.
* data/skeletons/lalr1.cc (symbol_name): Move its implementation in
the same place as in the class definition: after "error" and before
"context".
Reported by Martin Blais <blais@furius.ca>.
https://lists.gnu.org/r/help-bison/2020-05/msg00005.html
* data/skeletons/lalr1.cc (symbol_name): Make it public.
Add a private hidden hook to enable testing of private parts.
* tests/local.at (AT_DATA_GRAMMAR_PROLOGUE): Help Emacs find the right
language mode.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check that we
can read symbol_name.
The user gives yyexpected_tokens a limit: the max number of tokens she
wants to hear about. That's because an error message that reports a
bazillion of possible tokens is useless.
In that case yyexpected_tokens returned 0, so the user would not know
if there are too many expected tokens or none (yes, that's possible).
There are several ways to tell the user in which situation she's in:
- return some E2MANY, a negative value. Then it makes the pattern
int argsize = yypcontext_expected_tokens (ctx, arg, ARGS_MAX);
if (argsize < 0)
return argsize;
no longer valid, as for E2MANY (i) the user must generate the error
message anyway, and (ii) she should not return E2MANY
- return ARGS_MAX + 1. Then it makes it dangerous for the user, as
she has to iterate update `min (ARGS_MAX, argsize)`.
Returning 0 is definitely simpler and safer for the user, as it tells
her "this is not an error, just generate your message without a list
of expecting tokens". So let's still return 0, but set arg[0] to the
empty token when the list is really empty.
* data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/lalr1.java
* data/skeletons/yacc.c (yyexpected_tokens): Put the empty symbol
first if there are no possible tokens at all.
* examples/c/bistromathic/parse.y: Demonstrate how to use that.
For instance test 386, "glr.cc api.value.type={double}":
types.at:366: $CXX $CXXFLAGS $CPPFLAGS $LDFLAGS -o test test.cc $LIBS
stderr:
test.cc: In function 'ptrdiff_t yysplitStack(yyGLRStack*, ptrdiff_t)':
test.cc:490:4: error: 'PTRDIFF_MAX' was not declared in this scope
(PTRDIFF_MAX < SIZE_MAX ? PTRDIFF_MAX : YY_CAST (ptrdiff_t, SIZE_MAX))
^
test.cc:1805:37: note: in expansion of macro 'YYSIZEMAX'
ptrdiff_t half_max_capacity = YYSIZEMAX / 2 / state_size;
^~~~~~~~~
test.cc:490:4: note: suggested alternative: '__PTRDIFF_MAX__'
(PTRDIFF_MAX < SIZE_MAX ? PTRDIFF_MAX : YY_CAST (ptrdiff_t, SIZE_MAX))
^
test.cc:1805:37: note: in expansion of macro 'YYSIZEMAX'
ptrdiff_t half_max_capacity = YYSIZEMAX / 2 / state_size;
^~~~~~~~~
The failing tests are using glr.cc only, which I don't understand, the
problem is rather in glr.c, so I would expect glr.c tests to also fail.
Reported by Bruno Haible.
https://lists.gnu.org/archive/html/bug-bison/2020-05/msg00053.html
* data/skeletons/yacc.c: Move the block that defines
YYPTRDIFF_T/YYPTRDIFF_MAXIMUM, YYSIZE_T/YYSIZE_MAXIMUM, and
YYSIZEOF to...
* data/skeletons/c.m4 (b4_sizes_types_define): Here.
(b4_c99_int_type): Also take care of the #undefinition of short.
* data/skeletons/yacc.c, data/skeletons/glr.c: Use
b4_sizes_types_define.
* data/skeletons/glr.c: Adjust to use YYPTRDIFF_T/YYPTRDIFF_MAXIMUM,
YYSIZE_T/YYSIZE_MAXIMUM.
* data/skeletons/lalr1.java (Location): Make it a static class.
(Lexer.yylex, Lexer.getLVal, Lexer.getStartPos, Lexer.getEndPos):
These are not needed in push parsers.
* examples/java/calc/Calc.y: Demonstrate push parsers in the Java.
* doc/bison.texi: Push parsers have been supported for a long time,
remove incorrect statements stating the opposite.
To write unit tests for their scanners, some users depended on
symbol_type::token():
Lexer lex("12345");
symbol_type t = lex.nextToken();
assert(t.token() == token::INTLIT);
assert(t.value.as<int>() == 12345);
But symbol_type::token() was removed in Bison 3.5 because it relied on
a conversion table. So users had to find other patterns, such as
assert(t.type_get() == by_type(token::INTLIT).type_get());
which relies on several private implementation details.
As part of transitioning from "token type" to "token kind", and making
this a public and documented interface, "by_type" was renamed
"by_kind" and "type_get()" was renamed as "kind()". The latter had
backward compatibility mechanisms, not the former.
In Bison 3.6 none of this should be used, but rather
assert(t.kind() == symbol_kind::S_INTLIT);
Reported by Pramod Kumbhar.
https://lists.gnu.org/r/bug-bison/2020-05/msg00012.html
* data/skeletons/c++.m4 (by_type): Make it an alias to by_kind.
I don't plan to fix everything in one go. But this was in the way of
the next commit.
* data/skeletons/lalr1.java: Avoid space before parens.
* tests/java.at: Adjust.