Commit Graph

6295 Commits

Author SHA1 Message Date
Akim Demaille
99664706e2 traces: improve logs
* src/lalr.c: Move logs to a better place to understand the chronology
of events.
* src/symlist.c (symbol_list_syms_print): Don't dump core on type
elements.
2019-04-12 08:33:34 +02:00
Akim Demaille
a745041b7d doc: minor fixes
* doc/bison.texi: Use consistently $ and @kbd in shell examples.
Prefer sticking to English words: output and file instead of outfile
and infile.
2019-04-07 12:45:45 +02:00
Akim Demaille
0dd97f7c87 regen 2019-04-03 19:20:39 +02:00
Akim Demaille
69a823c72d bison: use no-lines
The 'regen' commit in Bison's history are a nuisance.  They are
especially big because of the #lines.  Let's generate our parse
without these lines in the repository, but generate them in the
tarball.

* Makefile.am (AM_YFLAGS_WITH_LINES): New.
(AM_YFLAGS): Use it.
(dist-hook): Regenerate the parser with #lines.
2019-04-03 19:20:39 +02:00
Akim Demaille
0f193d2d21 no-lines: avoid leaving an empty line instead of the syncline
Currently, with --no-lines, instead of "#line file line\n", we emit
"\n".  Let's emit nothing.

* data/skeletons/bison.m4 (b4_syncline): Emit at end-of-line when enabled.
* data/skeletons/bison.m4, data/skeletons/c.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, src/output.c: Use dnl after b4_syncline to
avoid spurious empty lines.

* tests/synclines.at (Sync Lines): Make sure that --no-lines is like
grep -v #line.
* tests/calc.at: Make sure that a rich grammar file behaves properly
with %no-lines.
2019-04-03 19:20:39 +02:00
Akim Demaille
9832fdd6ef java: use full locations for diagnostics about destructors
Currently we use the syncline to report errors about a symbol's
destructor/printer.  This is not accurate (only file and line), and
this is incorrect: the file name is double quotes (a recent change,
needed to make sure we escape properly double quotes in it).  And
worst of all: with --no-line, b4_syncline expands to nothing.

Rather, push the locations into the backend, and use them.

* src/muscle-tab.h, src/muscle-tab.c (muscle_location_grow): Make it
public.
* src/output.c (prepare_symbol_definitions): Use it to pubish the
location of the printer and destructor.
* data/skeletons/lalr1.java: Use complain_at instead of complain.
* tests/java.at (Java invalid directives): Adjust expectations.
* data/skeletons/bison.m4 (b4_symbol_action_location): Remove.
We should not use b4_syncline this way.
2019-04-03 19:20:39 +02:00
Akim Demaille
507c679b9b java: prefer errors to fatal errors
Fatal errors are inconvenient, and should be reserved to cases where
we cannot continue.  Here, it could even be warnings actually: these
directives will simply be ignored.

* data/skeletons/lalr1.java: Prefer error (b4_complain) to fatal
errors (b4_fatal).
* tests/java.at (Java invalid directives): New.
2019-04-03 19:20:39 +02:00
Akim Demaille
0b42cf8a36 tests: formatting changes
* tests/javapush.at: here.
2019-04-03 19:20:39 +02:00
Akim Demaille
10175e4a65 lalr: offer more flexibility in debugging routines
* src/state.h, src/state.c (state_transitions_print): New, extracted
from...
(state_transitions_set): here.
2019-04-03 07:29:54 +02:00
Akim Demaille
18831f985c lalr: don't overbook memory
I never understood why we book ngotos+1 slots for relations between
gotos: there are at most ngotos images, not ngotos+1 (and "includes"
does have cases where a goto is in relation with itself, so it's not
ngotos-1).

Maybe bbf37f2534 explains the +1: a bug
left us register a goto several times on occasion, and the +1 might
have been a means to avoid this problem in most cases.  Now that this
bug is addressed, we should no longer overbook memory, if only for the
clarity of the code ("why ngotos+1 instead of ngotos?").

* src/lalr.c: A goto has at most ngotos images, not ngotos+1.
While at it, avoid useless repeated call to map_goto introduced in
bbf37f2534.
2019-03-31 13:59:28 +02:00
Akim Demaille
6d4e6bf118 lalr: show lookback for debug
* src/lalr.c (lookback_print): New.
(build_relations): Use it.
Also show edges.
2019-03-30 17:34:56 +01:00
Akim Demaille
a8558bc5a6 diagnostics: don't crash when declaring the error token as an nterm
Reported by wcventure.
http://lists.gnu.org/archive/html/bug-bison/2019-03/msg00008.html

* src/symtab.c (complain_class_redeclared): Don't print empty
locations.
There can only be empty locations for predefined symbols.  And the
only symbol that is lexically available is the error token.  So this
appears to be the only possible way to have an error involving an
empty location.
* tests/input.at (Symbol class redefinition): Check it.
2019-03-30 16:37:47 +01:00
Akim Demaille
bbf37f2534 lalr: fix segmentation violation
The "includes" relation [DeRemer 1982] is between gotos, so of course,
for a given goto, there cannot be more that ngotos (number of gotos)
images.  But we manipulate the set of images of a goto as a list,
without checking that an image was not already introduced.  So we can
"register" way more images than ngotos, leading to a crash (heap
buffer overflow).

Reported by wcventure.
http://lists.gnu.org/archive/html/bug-bison/2019-03/msg00007.html

For the records, this bug is present in the first committed version of
Bison.

* src/lalr.c (build_relations): Don't insert the same goto several
times.
* tests/sets.at (Build Relations): New.
2019-03-30 10:10:39 +01:00
Akim Demaille
d332ff3c77 state: more debug traces
* src/state.c (state_transitions_set): Show the transitions.
2019-03-30 10:10:39 +01:00
Akim Demaille
eb92ec3dc6 style: rename variables for consistency
* src/lalr.c: Use trans for transitions, and reds for reductions, as
elsewhere in the code.
* src/state.h: Comment changes.
2019-03-30 10:10:39 +01:00
Akim Demaille
dee8fbbc1e gram: fix and improve log message
It seems that not many people read these logs: the error was
introduced in 2001 (3067fbef53),

* src/gram.c (grammar_dump): Fix the headers of the table: remove
duplicate display of "Ritem Range".
While at it, remove duplicate display of the rule number (and remove
an incorrect comment about it: these numbers _are_ equal).
* tests/sets.at (Reduced Grammar): Use useless rule, nterm and token
in the example.
2019-03-30 10:10:39 +01:00
Akim Demaille
75303c61d8 tests: add a tool for mass updates
When we update some output format, too many adjustements must be made
by hand.  This script updates most tests based on the actual output
made during the tests.

* build-aux/update-test: New.
2019-03-30 08:20:31 +01:00
Akim Demaille
af99826ef4 style: remove now useless _GL_UNUSED
* src/getargs.c (getargs_colors): Here.
Useless since 4d34b06fb3.
2019-03-25 08:39:50 +01:00
Theophile Ranquet
af1c6f973a tables: use bitsets for a performance boost
Suggested by Yuri at
<http://lists.gnu.org/archive/html/bison-patches/2012-01/msg00000.html>.

The improvement is marginal for most grammars, but notable for large
grammars (e.g., PosgreSQL's postgre.y), and very large for the
sample.y grammar submitted by Yuri in
http://lists.gnu.org/archive/html/bison-patches/2012-01/msg00012.html.
Measured with --trace=time -fsyntax-only.

parser action tables    postgre.y     sample.y
Before                 0,129 (44%)  37,095 (99%)
After                  0,117 (42%)   5,046 (93%)

* src/tables.c (pos): Replace this set of integer coded as an unsorted
array or integers with...
(pos_set): this bitset.
2019-03-24 19:16:19 +01:00
Akim Demaille
b5cd777ad6 yacc.c: don't suggest api.header.include when --defines is not used
See 4e19ab9fcd: the suggestion to
include the header file should not be emitted when the header is not
generated.

* data/skeletons/yacc.c: Here.
2019-03-24 18:52:58 +01:00
Akim Demaille
ae91c3cce3 reader: clarify variable names
* src/reader.c (grammar_rule_check_and_complete): When 'p' and 'lhs'
are aliases, prefer the latter, for clarity and consistency.
(grammar_current_rule_begin): Avoid 'p', current_rule suffices.
* src/gram.h, src/gram.c: Comment changes.

ptdr#	calc.tab.c
2019-03-24 18:40:46 +01:00
Akim Demaille
5de4e79fc8 diagnostics: style changes
* src/location.c (location_caret): Clarify a bit.
2019-03-24 18:40:46 +01:00
Akim Demaille
4d34b06fb3 diagnostics: use gnulib's libtextstyle-optional
Bruno Haible just added a default implementation of libtextstyle's
interface when the library is not available.
https://lists.gnu.org/archive/html/bison-patches/2019-03/msg00025.html

* gnulib: Update.
* bootstrap.conf: Replace libtextstyle with libtextstyle-optional.
* src/complain.c, src/getargs.c: Remove now useless cpp guards.
2019-03-24 18:40:46 +01:00
Akim Demaille
22a413ce9f diagnostics: fix handling of style in limit cases
* src/location.c (location_caret): Beware of the cases where the start
and end columns are the same, or when the location is multilines.
2019-03-23 10:21:18 +01:00
Akim Demaille
01855ca328 warnings: don't use _Noreturn with G++ 4.7 in C++98 mode
The timevar and bitset modules now use the c99 module which causes
$CXX to now include -std=gnu++11 when possible.  Unfortunately, G++
4.7 does not implement [[noreturn]] in C++11 mode, so our tests of
glr.cc (which uses _Noreturn) fail with

    input.cc:954:1: error: expected unqualified-id before '[' token

right before [[noreturn]].  4.8 works fine.

* data/skeletons/c.m4 (b4_attribute_define): Do not use [[noreturn]]
with GCC 4.7.
2019-03-23 10:15:11 +01:00
Akim Demaille
225bc5836a d: tests: use more a natural approach for the scanner
See f8408562f8.

* tests/calc.at: Stop imitating the C API.
Prepare more tests to run in the future.
%verbose works as expected (what a surprise, it's unrelated to the
skeleton...).
2019-03-17 16:43:36 +01:00
Akim Demaille
941cdf921d regen 2019-03-17 16:36:05 +01:00
Akim Demaille
58ae95670b style: rename spec_defines_file as spec_header_file
The variable spec_defines_file denotes the name of the generated
header.  Its name is derived from --defines/%defines, whose name in
turn is derived from the fact that the header, in Yacc, contained the

Not only does the header now contain a lot more than just the token
definitions, but we no longer even generate macros, but an enum...

Let's modernize our vocabulary.

* src/files.h, src/files.c (spec_defines_file): Rename as...
(spec_header_file): this.
2019-03-17 16:36:05 +01:00
Akim Demaille
4e19ab9fcd yacc.c: provide a means to include the header in the implementation
Currently when --defines is used, we generate a header, and paste an
exact copy of it into the generated parser implementation file.  Let's
provide a means to #include it instead.

We don't do it by default because of the Autotools' ylwrap.  This
program wraps invocations of yacc (that uses a fixed output name:
y.tab.c, y.tab.h, y.output) to support a more modern naming
scheme (dir/foo.y -> dir/foo.tab.c, dir/foo.tab.h, etc.).  It does
that by renaming the generated files, and then by running sed to
propagate these renamings inside the files themselves.

Unfortunately Automake's Makefiles uses Bison as if it were Yacc (with
--yacc or with -o y.tab.c) and invoke bison via ylwrap.  As a
consequence, as far as Bison is concerned, the output files are
y.tab.c and y.tab.h, so it emits '#include "y.tab.h"'.  So far, so
good.  But now ylwrap processes this '#include "y.tab.h"' into
'#include "dir/foo.tab.h"', which is not guaranteed to always work.

So, let's do the Right Thing when the output file is not y.tab.c, in
which case the user should %define api.header.include.  Binding this
behavior to --yacc is tempting, but we recently told people to stop
using --yacc (as it also enables the Yacc warnings), but rather to use
-o y.tab.c.

Yacc.c is the only skeleton concerned: all the others do include their
header.

* data/skeletons/yacc.c (b4_header_include_if): New.
(api.header.include): Provide a default value when the output is not
y.tab.c.
* src/parse-gram.y (api.header.include): Define.
2019-03-17 16:36:05 +01:00
Akim Demaille
6cb612e7e3 d: don't link against LIBS
* tests/local.at (AT_COMPILE_D): Don't pass LIBS, dmd does not like
being given -lintl.
2019-03-17 13:21:25 +01:00
Akim Demaille
35add841ee address warnings from GCC's UB sanitizer
Running with CC='gcc-mp-8 -fsanitize=undefined' revealed Undefined
Behaviors.
https://lists.gnu.org/archive/html/bison-patches/2019-03/msg00008.html

* src/state.c (errs_new): Don't call memcpy with NULL as source.
* src/location.c (add_column_width): Don't assume that the column
argument is nonnegative: the scanner sometimes "backtracks" (e.g., see
ROLLBACK_CURRENT_TOKEN and DEPRECATED) in which case we can have
negative column numbers (temporarily).
Found in test 3 (Invalid inputs).
2019-03-17 13:21:25 +01:00
Akim Demaille
f6e38d7ac9 diagnostics: use libtextstyle for colored output
Bruno Haible released libtextstyle, a library for colored output based
on CSS.  Let's use it to generate colored diagnostics, provided
libtextstyle is available.

See
https://lists.gnu.org/archive/html/bug-gnulib/2019-01/msg00176.html
https://lists.gnu.org/archive/html/bison-patches/2019-02/msg00073.html
https://lists.gnu.org/archive/html/bison-patches/2019-02/msg00084.html
https://lists.gnu.org/archive/html/bison-patches/2019-03/msg00007.html

* bootstrap.conf (gnulib_modules): Use libtextstyle when possible.
* data/diagnostics.css: New.
* src/complain.c (begin_use_class, end_use_class, flush)
(severity_style, complain_init_color): New.
Use them.
* src/getargs.c (getargs_colors): New.
(getargs): Use it.
Skip --color and --style.
* src/location.h, src/location.c (location_print): Use a style.

* tests/bison.in: Force --color=yes when stderr is a tty.
* tests/local.at: Disable colors during the test suite.
* tests/input.at: Adjust expectations to the extra options passed on
the command line.
2019-03-16 16:46:17 +01:00
Akim Demaille
855fbf1c11 style: clean up complain.c
* src/complain.c (severity_prefix): New.
(error_message): Take the severity as argument, instead of the prefix.
2019-03-16 16:46:17 +01:00
Akim Demaille
e5ec21215e yacc.c: emit the header before the implementation file
* data/skeletons/yacc.c: here.
This is more logical for the time stamps, but it's also required by
following patches: the shared declarations are also in charge of
handling api.value.type=union.  So far, they are run in the
implementation file in both cases (with or without header).  But if we
run them only in the header, then the implementation file is emited
with incorrect support for api.value.type=union.
Arguably we should not have such dependencies.  This is because we
have side-effects in our backend (redefining the symbols' type and
type_tag).  In the future we should find a better solution for this,
without sacrificing the independence of the backend from bison
itself (i.e., I don't think we should handle api.value.type=union in
bison, leave it to m4).
2019-03-16 10:14:18 +01:00
Akim Demaille
91bbf4219d simplify the generated #line
Currently we generate things like:

    #line 683 "src/parse-gram.y" /* yacc.c:316  */

The first part is of course very important: compilers point the users
to their grammar file rather than into the generated parser.  The
second part points to the place in the skeletons that generated this
piece of code.

This dependency on the Bison skeletons generates lots of useless 'git
diff'.  This location is useless for the regular user (who does not
care about the skeletons) and is actually not useful for Bison
developpers too (I never used this to locate the code in skeletons
that generated output).  So disable it completely.  If someone thinks
this was actually useful, a %define variable should be provided to
control the level of verbosity of '#line', in replacement of
--no-lines.

So now, generate:

    #line 683 "src/parse-gram.y"

* data/skeletons/bison.m4 (b4_sync_end): Emit nothing.
2019-03-16 10:12:09 +01:00
Akim Demaille
f4c1586454 gnulib: update 2019-03-13 08:21:34 +01:00
Akim Demaille
9a71d9d1c6 tests: remove duplicates
* tests/regression.at (Invalid inputs, Invalid inputs with {}):
Remove, there are exact copies of them in input.at.
2019-03-13 08:21:34 +01:00
Akim Demaille
bb0310a353 d: simplify the API to build the scanner of the example
* examples/d/calc.y (calcLexer): Add an overload for File.
Use it.
2019-03-02 09:56:41 +01:00
H. S. Teoh
f8408562f8 d: modernize the scanner of the example
https://lists.gnu.org/archive/html/bison-patches/2019-02/msg00121.html

* examples/d/calc.y (CalcLexer): Stop shoehorning C's API into D: use
a range based approach in the scanner, rather than some imitation of
getc/ungetc.
(main): Adjust.
2019-03-01 21:56:06 +01:00
Akim Demaille
8eac78f8ef d: tests: use fewer global variables
* tests/calc.at: Move 'input' into the scanner.
2019-03-01 21:56:06 +01:00
Akim Demaille
d57751d2fb lalr: clarify the count of lookaheads
* src/lalr.c (state_lookahead_tokens_count): Remove wierd `+=` that is
actually an `=`.
2019-02-28 06:47:19 +01:00
Akim Demaille
e062b9f70d lalr: clarify the API
* src/state.h, src/state.c (state_reduction_find): Clarify.
Die on errors.
* src/lalr.c (goto_list_new): New.
Use it.
2019-02-28 06:47:19 +01:00
Akim Demaille
c837141832 lalr: improve traces
* src/lalr.c (follows_print): Just print the symbol tag.
Take and print a title.
Indent the output.
Use it to print the various steps of the computation.
(lookahead_tokens_print): Fix a lie: the number displayed is not the
number of tokens.
Don't display states that don't even have reductions.
2019-02-28 06:47:19 +01:00
Akim Demaille
a415a78d71 lalr: print the 'reads' relation
* src/relation.h, src/relation.c (relation_print): Accept and use a
title.
Don't print empty rows.
Indent the output.
Adjust dependencies.
* src/lalr.c (initialize_goto_follows): Print 'reads' in traces.
2019-02-27 19:06:32 +01:00
Akim Demaille
5255b919ae style: comment changes
* src/lr0.c: here.
2019-02-27 19:06:32 +01:00
Akim Demaille
b12f9c76e2 dlang: initial changes to run the calc tests on it
* configure.ac (DCFLAGS): Define.
* tests/atlocal.in: Receive it.
* data/skeletons/d.m4 (api.parser.class): Remove spurious YY.
* data/skeletons/lalr1.d (yylex): Return an int instead of a
YYTokenType, so that we can use characters as tokens.
* examples/d/calc.y: Adjust.
* tests/local.at: Initial support for D.
(AT_D_IF, AT_DATA_GRAMMAR(D), AT_YYERROR_DECLARE(d))
(AT_YYERROR_DECLARE_EXTERN(d), AT_YYERROR_DEFINE(d))
(AT_MAIN_DEFINE(d), AT_COMPILE_D, AT_LANG_COMPILE(d), AT_LANG_EXT(d)):
New.
* tests/calc.at: Initial support for D.
* tests/headers.at
2019-02-26 18:27:13 +01:00
Akim Demaille
575b814119 d: improve the example
* examples/d/calc.y: Exit with failure on errors.
Remove useless operators (=, !) meant for the test suite.
Add unary + for symmetry.
* examples/d/calc.test: Adjust expectations.
2019-02-26 08:42:24 +01:00
Akim Demaille
661bbacfc7 tests: style changes
* tests/local.at AT_YYERROR_DEFINE(java): Use more consistent names.
2019-02-26 08:42:24 +01:00
Akim Demaille
d04962f788 style: eliminate useless indirection
* src/relation.h, src/relation.c (relation_digraph): Don't take the
biteetv as a pointer, it is already a pointer (as it's an array).
2019-02-25 06:19:55 +01:00
Akim Demaille
ec8142391a style: rename function for clarity
Commit db34f79889 renames the variable F
as goto_follows, but forgot to rename this function.

* src/lalr.c (initialize_F): Rename as...
(initialize_goto_follows): this.
2019-02-25 06:19:55 +01:00