Commit Graph

6284 Commits

Author SHA1 Message Date
Akim Demaille
a8558bc5a6 diagnostics: don't crash when declaring the error token as an nterm
Reported by wcventure.
http://lists.gnu.org/archive/html/bug-bison/2019-03/msg00008.html

* src/symtab.c (complain_class_redeclared): Don't print empty
locations.
There can only be empty locations for predefined symbols.  And the
only symbol that is lexically available is the error token.  So this
appears to be the only possible way to have an error involving an
empty location.
* tests/input.at (Symbol class redefinition): Check it.
2019-03-30 16:37:47 +01:00
Akim Demaille
bbf37f2534 lalr: fix segmentation violation
The "includes" relation [DeRemer 1982] is between gotos, so of course,
for a given goto, there cannot be more than ngotos (number of gotos)
images.  But we manipulate the set of images of a goto as a list,
without checking that an image was not already introduced.  So we can
"register" way more images than ngotos, leading to a crash (heap
buffer overflow).

Reported by wcventure.
http://lists.gnu.org/archive/html/bug-bison/2019-03/msg00007.html

For the record, this bug is present in the first committed version of
Bison.

* src/lalr.c (build_relations): Don't insert the same goto several
times.
* tests/sets.at (Build Relations): New.
2019-03-30 10:10:39 +01:00
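The failure mode described in bbf37f2534 can be sketched as follows. This is only an illustration under stated assumptions: `edge_list`, `edge_list_contains` and `edge_list_add_unique` are hypothetical names, not the code in src/lalr.c. The idea is that a buffer sized for ngotos entries stays in bounds only if the same goto is never registered twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed-capacity list of goto numbers (not Bison's types).  */
typedef struct
{
  int *items;       /* storage for at most `capacity` entries */
  size_t size;      /* number of entries currently stored */
  size_t capacity;  /* e.g., ngotos */
} edge_list;

/* Return true if GOTO_NUM is already stored in LIST.  */
static bool
edge_list_contains (const edge_list *list, int goto_num)
{
  for (size_t i = 0; i < list->size; ++i)
    if (list->items[i] == goto_num)
      return true;
  return false;
}

/* Append GOTO_NUM unless it is already present.  Since duplicates are
   rejected, the size is bounded by the number of distinct gotos.  */
static void
edge_list_add_unique (edge_list *list, int goto_num)
{
  if (edge_list_contains (list, goto_num))
    return;
  assert (list->size < list->capacity);
  list->items[list->size++] = goto_num;
}
```

Without the membership check, repeated insertions of the same goto would eventually write past `capacity`, which is the kind of heap buffer overflow the commit message reports.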
Akim Demaille
d332ff3c77 state: more debug traces
* src/state.c (state_transitions_set): Show the transitions.
2019-03-30 10:10:39 +01:00
Akim Demaille
eb92ec3dc6 style: rename variables for consistency
* src/lalr.c: Use trans for transitions, and reds for reductions, as
elsewhere in the code.
* src/state.h: Comment changes.
2019-03-30 10:10:39 +01:00
Akim Demaille
dee8fbbc1e gram: fix and improve log message
It seems that not many people read these logs: the error was
introduced in 2001 (3067fbef53).

* src/gram.c (grammar_dump): Fix the headers of the table: remove
duplicate display of "Ritem Range".
While at it, remove duplicate display of the rule number (and remove
an incorrect comment about it: these numbers _are_ equal).
* tests/sets.at (Reduced Grammar): Use useless rule, nterm and token
in the example.
2019-03-30 10:10:39 +01:00
Akim Demaille
75303c61d8 tests: add a tool for mass updates
When we update some output format, too many adjustments must be made
by hand.  This script updates most tests based on the actual output
made during the tests.

* build-aux/update-test: New.
2019-03-30 08:20:31 +01:00
Akim Demaille
af99826ef4 style: remove now useless _GL_UNUSED
* src/getargs.c (getargs_colors): Here.
Useless since 4d34b06fb3.
2019-03-25 08:39:50 +01:00
Theophile Ranquet
af1c6f973a tables: use bitsets for a performance boost
Suggested by Yuri at
<http://lists.gnu.org/archive/html/bison-patches/2012-01/msg00000.html>.

The improvement is marginal for most grammars, but notable for large
grammars (e.g., PostgreSQL's postgre.y), and very large for the
sample.y grammar submitted by Yuri in
http://lists.gnu.org/archive/html/bison-patches/2012-01/msg00012.html.
Measured with --trace=time -fsyntax-only.

parser action tables    postgre.y     sample.y
Before                 0,129 (44%)  37,095 (99%)
After                  0,117 (42%)   5,046 (93%)

* src/tables.c (pos): Replace this set of integers coded as an unsorted
array of integers with...
(pos_set): this bitset.
2019-03-24 19:16:19 +01:00
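To make the trade-off in af1c6f973a concrete, here is a minimal sketch contrasting membership tests on an unsorted array with a bitset. It assumes a hypothetical fixed bound `POS_MAX` and hand-rolled helpers (`pos_bits_set`, `pos_bits_test`, `array_contains`); it does not use gnulib's bitset module, which is what the commit actually relies on.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

enum { POS_MAX = 1024 };  /* hypothetical bound on the stored values */
#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)
#define NWORDS ((POS_MAX + WORD_BITS - 1) / WORD_BITS)

/* One bit per possible value.  */
typedef struct { unsigned long words[NWORDS]; } pos_bits;

/* Linear scan: cost grows with the number of stored elements.  */
static bool
array_contains (const int *items, size_t size, int value)
{
  for (size_t i = 0; i < size; ++i)
    if (items[i] == value)
      return true;
  return false;
}

/* Record POS in SET in constant time.  */
static void
pos_bits_set (pos_bits *set, size_t pos)
{
  assert (pos < (size_t) POS_MAX);
  set->words[pos / WORD_BITS] |= 1UL << (pos % WORD_BITS);
}

/* Test whether POS is in SET in constant time.  */
static bool
pos_bits_test (const pos_bits *set, size_t pos)
{
  assert (pos < (size_t) POS_MAX);
  return (set->words[pos / WORD_BITS] >> (pos % WORD_BITS)) & 1UL;
}
```

The bitset trades memory proportional to the value range for constant-time lookups, which plausibly explains why the gain is largest on very large grammars.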
Akim Demaille
b5cd777ad6 yacc.c: don't suggest api.header.include when --defines is not used
See 4e19ab9fcd: the suggestion to
include the header file should not be emitted when the header is not
generated.

* data/skeletons/yacc.c: Here.
2019-03-24 18:52:58 +01:00
Akim Demaille
ae91c3cce3 reader: clarify variable names
* src/reader.c (grammar_rule_check_and_complete): When 'p' and 'lhs'
are aliases, prefer the latter, for clarity and consistency.
(grammar_current_rule_begin): Avoid 'p', current_rule suffices.
* src/gram.h, src/gram.c: Comment changes.
2019-03-24 18:40:46 +01:00
Akim Demaille
5de4e79fc8 diagnostics: style changes
* src/location.c (location_caret): Clarify a bit.
2019-03-24 18:40:46 +01:00
Akim Demaille
4d34b06fb3 diagnostics: use gnulib's libtextstyle-optional
Bruno Haible just added a default implementation of libtextstyle's
interface when the library is not available.
https://lists.gnu.org/archive/html/bison-patches/2019-03/msg00025.html

* gnulib: Update.
* bootstrap.conf: Replace libtextstyle with libtextstyle-optional.
* src/complain.c, src/getargs.c: Remove now useless cpp guards.
2019-03-24 18:40:46 +01:00
Akim Demaille
22a413ce9f diagnostics: fix handling of style in limit cases
* src/location.c (location_caret): Beware of the cases where the start
and end columns are the same, or when the location spans several lines.
2019-03-23 10:21:18 +01:00
Akim Demaille
01855ca328 warnings: don't use _Noreturn with G++ 4.7 in C++98 mode
The timevar and bitset modules now use the c99 module which causes
$CXX to now include -std=gnu++11 when possible.  Unfortunately, G++
4.7 does not implement [[noreturn]] in C++11 mode, so our tests of
glr.cc (which uses _Noreturn) fail with

    input.cc:954:1: error: expected unqualified-id before '[' token

right before [[noreturn]].  4.8 works fine.

* data/skeletons/c.m4 (b4_attribute_define): Do not use [[noreturn]]
with GCC 4.7.
2019-03-23 10:15:11 +01:00
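The kind of guard described in 01855ca328 might look like the sketch below. `MY_NORETURN`, `die` and `checked_div` are hypothetical names chosen for illustration; the actual macro is produced by b4_attribute_define in data/skeletons/c.m4 and may differ in its conditions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro: pick a "does not return" marker that the compiler
   is known to accept.  G++ older than 4.8 is excluded from the
   [[noreturn]] branch, as the commit message explains.  */
#if defined __cplusplus && __cplusplus >= 201103L \
    && !(defined __GNUC__ && !defined __clang__ \
         && __GNUC__ == 4 && __GNUC_MINOR__ < 8)
# define MY_NORETURN [[noreturn]]
#elif defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L
# define MY_NORETURN _Noreturn
#elif defined __GNUC__
# define MY_NORETURN __attribute__ ((__noreturn__))
#else
# define MY_NORETURN
#endif

/* Print MSG and terminate the program.  */
MY_NORETURN static void
die (const char *msg)
{
  assert (msg != NULL);
  fprintf (stderr, "fatal: %s\n", msg);
  exit (EXIT_FAILURE);
}

/* Divide A by B, dying on division by zero.  */
static int
checked_div (int a, int b)
{
  if (b == 0)
    die ("division by zero");
  return a / b;
}
```

Falling back to the GNU attribute keeps the annotation available on older compilers without requiring the C++11 syntax.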
Akim Demaille
225bc5836a d: tests: use a more natural approach for the scanner
See f8408562f8.

* tests/calc.at: Stop imitating the C API.
Prepare more tests to run in the future.
%verbose works as expected (what a surprise, it's unrelated to the
skeleton...).
2019-03-17 16:43:36 +01:00
Akim Demaille
941cdf921d regen 2019-03-17 16:36:05 +01:00
Akim Demaille
58ae95670b style: rename spec_defines_file as spec_header_file
The variable spec_defines_file denotes the name of the generated
header.  Its name is derived from --defines/%defines, whose name in
turn is derived from the fact that the header, in Yacc, contained the

Not only does the header now contain a lot more than just the token
definitions, but we no longer even generate macros, but an enum...

Let's modernize our vocabulary.

* src/files.h, src/files.c (spec_defines_file): Rename as...
(spec_header_file): this.
2019-03-17 16:36:05 +01:00
Akim Demaille
4e19ab9fcd yacc.c: provide a means to include the header in the implementation
Currently when --defines is used, we generate a header, and paste an
exact copy of it into the generated parser implementation file.  Let's
provide a means to #include it instead.

We don't do it by default because of the Autotools' ylwrap.  This
program wraps invocations of yacc (that uses a fixed output name:
y.tab.c, y.tab.h, y.output) to support a more modern naming
scheme (dir/foo.y -> dir/foo.tab.c, dir/foo.tab.h, etc.).  It does
that by renaming the generated files, and then by running sed to
propagate these renamings inside the files themselves.

Unfortunately Automake's Makefiles use Bison as if it were Yacc (with
--yacc or with -o y.tab.c) and invoke bison via ylwrap.  As a
consequence, as far as Bison is concerned, the output files are
y.tab.c and y.tab.h, so it emits '#include "y.tab.h"'.  So far, so
good.  But now ylwrap processes this '#include "y.tab.h"' into
'#include "dir/foo.tab.h"', which is not guaranteed to always work.

So, let's do the Right Thing when the output file is not y.tab.c, in
which case the user should %define api.header.include.  Binding this
behavior to --yacc is tempting, but we recently told people to stop
using --yacc (as it also enables the Yacc warnings), but rather to use
-o y.tab.c.

Yacc.c is the only skeleton concerned: all the others do include their
header.

* data/skeletons/yacc.c (b4_header_include_if): New.
(api.header.include): Provide a default value when the output is not
y.tab.c.
* src/parse-gram.y (api.header.include): Define.
2019-03-17 16:36:05 +01:00
Akim Demaille
6cb612e7e3 d: don't link against LIBS
* tests/local.at (AT_COMPILE_D): Don't pass LIBS, dmd does not like
being given -lintl.
2019-03-17 13:21:25 +01:00
Akim Demaille
35add841ee address warnings from GCC's UB sanitizer
Running with CC='gcc-mp-8 -fsanitize=undefined' revealed Undefined
Behaviors.
https://lists.gnu.org/archive/html/bison-patches/2019-03/msg00008.html

* src/state.c (errs_new): Don't call memcpy with NULL as source.
* src/location.c (add_column_width): Don't assume that the column
argument is nonnegative: the scanner sometimes "backtracks" (e.g., see
ROLLBACK_CURRENT_TOKEN and DEPRECATED) in which case we can have
negative column numbers (temporarily).
Found in test 3 (Invalid inputs).
2019-03-17 13:21:25 +01:00
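The first fix in 35add841ee addresses a classic pitfall: passing a null pointer to memcpy is undefined behavior even when the byte count is zero. A minimal sketch of the guard, using a hypothetical helper `int_array_dup` rather than the real errs_new from src/state.c:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of the NUM ints
   in SRC.  SRC may be NULL when NUM is 0.  */
static int *
int_array_dup (const int *src, size_t num)
{
  /* Allocate at least one element so the result is never NULL.  */
  int *res = malloc ((num ? num : 1) * sizeof *res);
  if (!res)
    abort ();
  /* Only call memcpy when there is something to copy, so that a NULL
     source is never passed to it.  */
  if (num)
    {
      assert (src != NULL);
      memcpy (res, src, num * sizeof *res);
    }
  return res;
}
```

Building with -fsanitize=undefined, as the commit message does, is one way to catch an unguarded call of this kind at run time.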
Akim Demaille
f6e38d7ac9 diagnostics: use libtextstyle for colored output
Bruno Haible released libtextstyle, a library for colored output based
on CSS.  Let's use it to generate colored diagnostics, provided
libtextstyle is available.

See
https://lists.gnu.org/archive/html/bug-gnulib/2019-01/msg00176.html
https://lists.gnu.org/archive/html/bison-patches/2019-02/msg00073.html
https://lists.gnu.org/archive/html/bison-patches/2019-02/msg00084.html
https://lists.gnu.org/archive/html/bison-patches/2019-03/msg00007.html

* bootstrap.conf (gnulib_modules): Use libtextstyle when possible.
* data/diagnostics.css: New.
* src/complain.c (begin_use_class, end_use_class, flush)
(severity_style, complain_init_color): New.
Use them.
* src/getargs.c (getargs_colors): New.
(getargs): Use it.
Skip --color and --style.
* src/location.h, src/location.c (location_print): Use a style.

* tests/bison.in: Force --color=yes when stderr is a tty.
* tests/local.at: Disable colors during the test suite.
* tests/input.at: Adjust expectations to the extra options passed on
the command line.
2019-03-16 16:46:17 +01:00
Akim Demaille
855fbf1c11 style: clean up complain.c
* src/complain.c (severity_prefix): New.
(error_message): Take the severity as argument, instead of the prefix.
2019-03-16 16:46:17 +01:00
Akim Demaille
e5ec21215e yacc.c: emit the header before the implementation file
* data/skeletons/yacc.c: here.
This is more logical for the time stamps, but it's also required by
following patches: the shared declarations are also in charge of
handling api.value.type=union.  So far, they are run in the
implementation file in both cases (with or without header).  But if we
run them only in the header, then the implementation file is emitted
with incorrect support for api.value.type=union.
Arguably we should not have such dependencies.  This is because we
have side-effects in our backend (redefining the symbols' type and
type_tag).  In the future we should find a better solution for this,
without sacrificing the independence of the backend from bison
itself (i.e., I don't think we should handle api.value.type=union in
bison, leave it to m4).
2019-03-16 10:14:18 +01:00
Akim Demaille
91bbf4219d simplify the generated #line
Currently we generate things like:

    #line 683 "src/parse-gram.y" /* yacc.c:316  */

The first part is of course very important: compilers point the users
to their grammar file rather than into the generated parser.  The
second part points to the place in the skeletons that generated this
piece of code.

This dependency on the Bison skeletons generates lots of useless 'git
diff'.  This location is useless for the regular user (who does not
care about the skeletons) and is actually not useful for Bison
developers either (I never used this to locate the code in skeletons
that generated output).  So disable it completely.  If someone thinks
this was actually useful, a %define variable should be provided to
control the level of verbosity of '#line', in replacement of
--no-lines.

So now, generate:

    #line 683 "src/parse-gram.y"

* data/skeletons/bison.m4 (b4_sync_end): Emit nothing.
2019-03-16 10:12:09 +01:00
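For readers unfamiliar with the mechanism 91bbf4219d relies on, the sketch below shows what a '#line' directive does to the line and file a compiler reports. The function names are hypothetical; the file name and line number are the ones quoted in the commit message.

```c
#include <assert.h>
#include <string.h>

/* The directive renumbers the line that follows it, so __LINE__ on the
   next line evaluates to 683.  */
static int
reported_line (void)
{
#line 683 "src/parse-gram.y"
  return __LINE__;
}

/* After the directive, __FILE__ is the name given to '#line'.  */
static const char *
reported_file (void)
{
  return __FILE__;
}

/* Self-check of the mapping described above.  */
static void
check_line_mapping (void)
{
  assert (reported_line () == 683);
  assert (strcmp (reported_file (), "src/parse-gram.y") == 0);
}
```

This is why diagnostics from the C compiler point into the grammar file rather than into the generated parser.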
Akim Demaille
f4c1586454 gnulib: update 2019-03-13 08:21:34 +01:00
Akim Demaille
9a71d9d1c6 tests: remove duplicates
* tests/regression.at (Invalid inputs, Invalid inputs with {}):
Remove, there are exact copies of them in input.at.
2019-03-13 08:21:34 +01:00
Akim Demaille
bb0310a353 d: simplify the API to build the scanner of the example
* examples/d/calc.y (calcLexer): Add an overload for File.
Use it.
2019-03-02 09:56:41 +01:00
H. S. Teoh
f8408562f8 d: modernize the scanner of the example
https://lists.gnu.org/archive/html/bison-patches/2019-02/msg00121.html

* examples/d/calc.y (CalcLexer): Stop shoehorning C's API into D: use
a range based approach in the scanner, rather than some imitation of
getc/ungetc.
(main): Adjust.
2019-03-01 21:56:06 +01:00
Akim Demaille
8eac78f8ef d: tests: use fewer global variables
* tests/calc.at: Move 'input' into the scanner.
2019-03-01 21:56:06 +01:00
Akim Demaille
d57751d2fb lalr: clarify the count of lookaheads
* src/lalr.c (state_lookahead_tokens_count): Remove weird `+=` that is
actually an `=`.
2019-02-28 06:47:19 +01:00
Akim Demaille
e062b9f70d lalr: clarify the API
* src/state.h, src/state.c (state_reduction_find): Clarify.
Die on errors.
* src/lalr.c (goto_list_new): New.
Use it.
2019-02-28 06:47:19 +01:00
Akim Demaille
c837141832 lalr: improve traces
* src/lalr.c (follows_print): Just print the symbol tag.
Take and print a title.
Indent the output.
Use it to print the various steps of the computation.
(lookahead_tokens_print): Fix a lie: the number displayed is not the
number of tokens.
Don't display states that don't even have reductions.
2019-02-28 06:47:19 +01:00
Akim Demaille
a415a78d71 lalr: print the 'reads' relation
* src/relation.h, src/relation.c (relation_print): Accept and use a
title.
Don't print empty rows.
Indent the output.
Adjust dependencies.
* src/lalr.c (initialize_goto_follows): Print 'reads' in traces.
2019-02-27 19:06:32 +01:00
Akim Demaille
5255b919ae style: comment changes
* src/lr0.c: here.
2019-02-27 19:06:32 +01:00
Akim Demaille
b12f9c76e2 dlang: initial changes to run the calc tests on it
* configure.ac (DCFLAGS): Define.
* tests/atlocal.in: Receive it.
* data/skeletons/d.m4 (api.parser.class): Remove spurious YY.
* data/skeletons/lalr1.d (yylex): Return an int instead of a
YYTokenType, so that we can use characters as tokens.
* examples/d/calc.y: Adjust.
* tests/local.at: Initial support for D.
(AT_D_IF, AT_DATA_GRAMMAR(D), AT_YYERROR_DECLARE(d))
(AT_YYERROR_DECLARE_EXTERN(d), AT_YYERROR_DEFINE(d))
(AT_MAIN_DEFINE(d), AT_COMPILE_D, AT_LANG_COMPILE(d), AT_LANG_EXT(d)):
New.
* tests/calc.at: Initial support for D.
* tests/headers.at
2019-02-26 18:27:13 +01:00
Akim Demaille
575b814119 d: improve the example
* examples/d/calc.y: Exit with failure on errors.
Remove useless operators (=, !) meant for the test suite.
Add unary + for symmetry.
* examples/d/calc.test: Adjust expectations.
2019-02-26 08:42:24 +01:00
Akim Demaille
661bbacfc7 tests: style changes
* tests/local.at AT_YYERROR_DEFINE(java): Use more consistent names.
2019-02-26 08:42:24 +01:00
Akim Demaille
d04962f788 style: eliminate useless indirection
* src/relation.h, src/relation.c (relation_digraph): Don't take the
bitsetv as a pointer, it is already a pointer (as it's an array).
2019-02-25 06:19:55 +01:00
Akim Demaille
ec8142391a style: rename function for clarity
Commit db34f79889 renames the variable F
as goto_follows, but forgot to rename this function.

* src/lalr.c (initialize_F): Rename as...
(initialize_goto_follows): this.
2019-02-25 06:19:55 +01:00
Akim Demaille
59bec5fade lalr: more debug traces
I need to be able to read includes and goto_follows.

* src/relation.h, src/relation.c (relation_print): Provide a means to
pretty-print the nodes of the relation.
* src/lalr.c (goto_print, follows_print): New.
(set_goto_map): Use goto_print.
(build_relations): Show INCLUDES.
(compute_FOLLOWS): Rename as...
(compute_follows): this.
Show FOLLOWS.
2019-02-25 06:19:54 +01:00
Akim Demaille
5230e610fc style: minor changes
* examples/c/calc/calc.y, src/lalr.c: Reduce scope.
* src/gram.c: Prefer < to >.
2019-02-24 19:08:01 +01:00
Akim Demaille
b81419a9fd style: clarify the computation of the lookback edges
* src/lalr.c (build_relations): Reduce the scopes.
Instead of keeping rp alive in two different loops, clarify the second
one by having an index on the path we traverse (i.e., use that index
to compute the source state _and_ the symbol that labels the
transition).
This allows turning an obscure 'while'-loop into a clearer (IMHO)
'for'-loop.  We also consume more variables (by introducing p instead
of making more side effects on length), but we're in 2019, I don't
think this matters.  What does matter is that (IMHO again), this is
now clearer.
Also, use clearer names.
2019-02-24 19:07:32 +01:00
Akim Demaille
2b9ee006d8 style: scope reduction in tables.c
* src/tables.c: here.
* src/lalr.c: Prefer < to >.
2019-02-24 12:00:44 +01:00
Akim Demaille
609b40f1a1 d: formatting changes
* data/skeletons/d.m4, data/skeletons/lalr1.d: Avoid trailing spaces.
2019-02-24 07:03:59 +01:00
Akim Demaille
1e76448ced examples: remove stray examples
* examples/c/reentrant-calc: Remove.
I did not mean to include this example, it was replaced by
examples/c/reccalc.
2019-02-23 11:42:55 +01:00
Akim Demaille
967a59d2c0 tests: factor the execution of Java parsers
* tests/local.at (AT_MAIN_DEFINE(java)): Exit failure on failure.
(AT_PARSER_CHECK): If in Java, run AT_JAVA_PARSER_CHECK.
* tests/conflicts.at (AT_CONSISTENT_ERRORS_CHECK): Simplify.
2019-02-21 17:46:11 +01:00
Akim Demaille
fbf94ac900 tests: fix a Java test
* tests/conflicts.at (AT_CONSISTENT_ERRORS_CHECK): Fix quotation error.
2019-02-21 17:46:11 +01:00
Akim Demaille
a11c144609 tests: simplify AT_PARSER_CHECK usage
Currently the caller must specify the ./ prefix to its command.  Let's
avoid that: it will be nicer to read, make it easier to have a version
that works for Java and C/C++.

* tests/local.at (AT_PARSER_CHECK): Prefix the command with ./.
Adjust callers.
2019-02-21 17:46:11 +01:00
Akim Demaille
4848092bf8 tests: java: factor the definition of Position
* tests/local.at (AT_JAVA_POSITION_DEFINE): New.
* tests/java.at, tests/javapush.at: Use it.
2019-02-21 17:46:11 +01:00
Akim Demaille
948f3decb4 tests: dispatch per lang on AT_DATA_GRAMMAR
* tests/java.at: Do that.
* tests/conflicts.at: Simplify.

* tests/actions.at, tests/c++.at, tests/input.at, tests/local.at,
* tests/named-refs.at:
Use AT_BISON_OPTION_PUSHDEFS/AT_BISON_OPTION_POPDEFS.
2019-02-21 17:46:11 +01:00