Commit Graph

2690 Commits

Author SHA1 Message Date
Akim Demaille
a1aef9825a regen 2020-01-17 06:49:45 +01:00
Akim Demaille
a0675d707f Merge branch 'maint' into HEAD
* maint:
  gnulib: update
  lalr1.cc: avoid static_cast
  glr.c: add missing cast
  regen
  package: bump copyrights to 2020
  gitignore: update
2020-01-11 07:38:39 +01:00
Akim Demaille
04f3bfc596 regen 2020-01-10 19:21:35 +01:00
Akim Demaille
c67daa9a97 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-10 19:16:23 +01:00
Akim Demaille
e419eda5e2 regen 2020-01-10 18:38:02 +01:00
Akim Demaille
8036635251 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-05 10:26:35 +01:00
Akim Demaille
d55f240991 parser: pretend we are Bison 3.5
* src/parse-gram.y: Accept we're Bison 3.5.
2019-12-08 16:03:36 +01:00
Akim Demaille
6dca1eb950 regen 2019-12-06 08:27:55 +01:00
Akim Demaille
9e9e49224f diagnostics: style changes
* src/complain.h, src/complain.c: Comment changes.
* src/scan-skel.l: Reduce scopes.
* data/skeletons/bison.m4: Factor diagnostic functions.
2019-12-02 19:35:01 +01:00
Akim Demaille
ad32ec64c8 style: pacify syntax-check
* cfg.mk: No need to translate *.md files.
* data/skeletons/glr.c, data/skeletons/yacc.c: Fix space issues.
2019-11-20 07:10:27 +01:00
Akim Demaille
8a910107b3 diagnostics: complain about undeclared string tokens
String literals, which allow for better error messages, are (too)
liberally accepted by Bison, which might result in silent errors.  For
instance

    %type <exVal> cond "condition"

does not define “condition” as a string alias to 'cond' (nonterminal
symbols do not have string aliases).  It is rather equivalent to

    %nterm <exVal> cond
    %token <exVal> "condition"

i.e., it gives the type 'exVal' to the "condition" token, which was
clearly not the intention.

Introduce -Wdangling-alias to catch this.

* src/complain.h, src/complain.c: Add support for -Wdangling-alias.
(argmatch_warning_args): Sort.
* src/symtab.c (symbol_check_defined): Complain about dangling
aliases.
* doc/bison.texi: Document it.
* tests/input.at (Dangling aliases): New test.
2019-11-17 18:27:42 +01:00
Akim Demaille
28d1ca8f48 diagnostics: yacc reserves %type to nonterminals
On

    %token TOKEN1
    %type  <ival> TOKEN1 TOKEN2 't'
    %token TOKEN2
    %%
    expr:

bison -Wyacc gives

    input.y:2.15-20: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |               ^~~~~~
    input.y:2.29-31: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |                             ^~~
    input.y:2.22-27: warning: POSIX yacc reserves %type to nonterminals [-Wyacc]
        2 | %type  <ival> TOKEN1 TOKEN2 't'
          |                      ^~~~~~

The messages appear to be out of order, but they are emitted when the
error is found.

* src/symtab.h (symbol_class): Add pct_type_sym, used to denote
symbols appearing in %type.
* src/symtab.c (complain_pct_type_on_token): New.
(symbol_class_set): Check that %type is not applied to tokens.
(symbol_check_defined): pct_type_sym also means undefined.
* src/parse-gram.y (symbol_decl.1): Set the class to pct_type_sym.
* src/reader.c (grammar_current_rule_begin): pct_type_sym also means
undefined.
* tests/input.at (Yacc's %type): New.
2019-11-17 09:45:25 +01:00
Akim Demaille
60ebd8e210 regen 2019-11-16 12:54:44 +01:00
kaneko y
3765e3e790 gram.c: Fix condition of aver
* src/gram.c (grammar_dump): Fix condition of aver.
What we want to check is that rhs is followed by its rule.
2019-11-12 08:39:28 +01:00
Yuichiro Kaneko
17d34c231b gram.c: also print terminals in grammar_dump
* src/gram.c (grammar_dump): Print terminals likewise non terminals.
* tests/sets.at (Reduced Grammar): Update test case to catch up the
change and add a test case where prec and assoc are used.
2019-11-11 10:37:30 +01:00
Akim Demaille
cce6c998b6 diagnostics: add missing translation
* src/muscle-tab.c (muscle_percent_define_check_kind): Here.
2019-11-03 09:24:12 +01:00
Akim Demaille
c53b379784 style: fix cpp indentation
Reported by syntax-check.

* src/system.h: here.
2019-10-29 09:00:46 +01:00
Akim Demaille
8228d96d33 reader: reduce the "scope" of global variables
We have too many global variables, adding structure would help.  For a
start, let's hide some of the variables closer to their usage.

* src/getargs.c, src/files.h (current_file): Move to...
* src/scan-gram.c: here.
* src/scan-gram.h (gram_in, gram__flex_debug): Remove, make them
private to the scanner.
* src/reader.h, src/reader.c (reader): Take a grammar file as argument.
Move the handling of scanner variables to...
* src/scan-gram.l (gram_scanner_open, gram_scanner_close): here.
(gram_scanner_initialize): Remove, replaced by gram_scanner_open.
* src/main.c: Adjust.
2019-10-26 10:39:01 +02:00
Akim Demaille
a5fc4e3b44 regen 2019-10-26 10:39:01 +02:00
Akim Demaille
3be912e4af parser: use grammar_file instead of current_file
* src/parse-gram (%initial-action): here.
(handle_skeleton): Don't depend on the current file name to look for
"local" skeletons (subject to changes coming from "#lines"): depend
only on the initial file name, the one given on the command line.
2019-10-26 10:38:39 +02:00
Akim Demaille
4b4e532748 diagnostics: use grammar_file instead of current_file
Currently there are two globals denoting the input file: grammar_file
is the one from the command line, and current_file which might change
because of #line.  Use only the former.

* src/complain.c (error_message): here.
* tests/diagnostics.at: Adjust.
2019-10-26 09:11:40 +02:00
Akim Demaille
6e7d8ba6a7 reader: let symtab deal with the symbols
* src/reader.c (reader): Move the setting up of the builtin symbols to...
* src/symtab.c (symbols_new): here.
2019-10-25 07:48:07 +02:00
Akim Demaille
c680300a29 style: remove incorrect comment
Reported by Paul Eggert.

* src/system.h: here.
2019-10-25 07:41:38 +02:00
Akim Demaille
fa9871a2fb diagnostics: simplify location handling
Locations start at line 1.  Don't accept line 0.

* src/location.c (location_print): Don't print locations with line 0.
(location_caret): Simplify.
2019-10-24 18:00:43 +02:00
Akim Demaille
76597d01f3 build: reenable -Wtype-limits
See https://lists.gnu.org/archive/html/bug-bison/2019-10/msg00061.html
to https://lists.gnu.org/archive/html/bug-bison/2019-10/msg00073.html.

Paul Eggert's changes in gnulib do fix the issue for modern GCCs (7,
8, 9) on macOS.  Unfortunately these warnings are back on the
CI (GNU/Linux) with GCC 4.6, 4.7, (not 4.8) and 4.9.

Disable the warning locally.

* configure.ac (warn_common, warn_tests): Remove -Wtype-limits.
* src/system.h (IGNORE_TYPE_LIMITS_BEGIN, IGNORE_TYPE_LIMITS_END): New.
* src/InadequacyList.c, src/parse-gram.c, src/parse-gram.y,
* src/symtab.c: Use it.
2019-10-24 08:50:14 +02:00
Akim Demaille
bc5efb558d build: remove dmalloc support
Today sanitizers are a better alternative.

* m4/dmalloc.m4: Remove.
* configure.ac, src/system.h: Adjust.
2019-10-24 07:22:17 +02:00
Yuichiro Kaneko
3945beb1d2 style: update comment in reader.c
rrhs and rlhs were removed by b2ed6e5826.

* src/reader.c (packgram): Update comment.
2019-10-23 08:32:06 +02:00
Akim Demaille
048730c691 style: pacify syntax-check
* doc/.gitignore, src/complain.c, src/getargs.c,
* src/output.c: here.
2019-10-22 10:40:12 +02:00
Akim Demaille
ec64a0bc7e main: also free memory on errors
* src/derives.c (derives_free): Beware of NULL.
* src/main.c (main): Let the 'finish' label include memory release.
2019-10-21 17:18:32 +02:00
Akim Demaille
d76ea5ce06 style: reduce scope in derives
* src/derives.c: here.
And prefer prefix to postfix increment.
2019-10-21 17:18:32 +02:00
Akim Demaille
97d6da0c5b parser: clarify version checking
* src/parse-gram.y: Use the same conventions for gnulib as elsewhere:
<header.h>.
(str_to_version): New.
(handle_require): Use it.
Prefer < to >.
2019-10-20 17:57:28 +02:00
Paul Eggert
693e69f289 regen 2019-10-17 11:51:20 -07:00
Paul Eggert
8a4ec5d4e4 bison: check for int overflow in token numbers
* src/symtab.c: Include intprops.h
(symbol_user_token_number_set): Don’t allow user_token_number ==
INT_MAX because too much other code adds 1 to the user token number.
(symbols_token_translations_init): Complain on integer overflow
instead of indulging in undefined behavior.
2019-10-17 11:51:20 -07:00
Paul Eggert
052215a138 bison: check for int overflow when scanning
* src/scan-gram.l: Include errno.h, for errno.
(scan_integer, handle_syncline): Check for integer overflow.
* tests/input.at (too-large.y): Adjust to match new diagnostics.
2019-10-17 11:51:20 -07:00
Paul Eggert
15c1b913cf bison: check version numbers more carefully
* src/parse-gram.y: Include intprops.h.
(handle_require): Don’t indulge in undefined behavior if the major
or minor number is out of range.  Instead, check that the
resulting value is nonnegative, fits in int, and that the minor
number is less than 100.  Also, check that a number was parsed.
2019-10-17 11:51:20 -07:00
Akim Demaille
d9d37a1196 i18n: don't push too hard for '…'
Suggested by Paul Eggert.

* src/location.c (ellipsis): Clarify comment for translators.
2019-10-12 10:43:53 +02:00
Akim Demaille
c3db1394a1 regen 2019-10-11 08:52:04 +02:00
Akim Demaille
2c66acfec0 diagnostics: prefer "…" to "..." if the locale supports it
* src/location.c (ellipsis, ellipsize): New.
Use them.
2019-10-10 21:57:50 +02:00
Paul Eggert
5463291a91 Use “least” types for integers in Yacc tables
This changes the Yacc skeleton to use “least” integer types to
keep tables smaller on some platforms, which should lessen cache
pressure.  Since Bison uses the Yacc skeleton, it follows suit.
* data/skeletons/yacc.c: Include limits.h and stdint.h if this
seems to be needed.
(yytype_uint8, yytype_int8, yytype_uint16, yytype_int16):
If available, use GCC predefined macros __INT_MAX__ etc. to select
a “least” type, as this avoids namespace hassles.  Otherwise, if
available fall back on selecting a “least” type via the C99 macros
INT_MAX, INT_LEAST8_MAX, etc.  Otherwise, fall further back on one of
the builtin C99 types signed char, short, and int.  Make sure that
any selected type promotes to int.  Ignore any macros YYTYPE_INT16,
YYTYPE_INT8, YYTYPE_UINT16, YYTYPE_UINT8 defined by the user.
(ptrdiff_t, PTRDIFF_MAX): Simplify in the light of the above.
(yytype_uint8, yytype_uint16): Do not assume that unsigned char
and unsigned short promote to int, as this isn’t true on some
platforms (e.g., TI TMS320C55x).
* src/parse-gram.y (YYTYPE_INT16, YYTYPE_INT8, YYTYPE_UINT16)
(YYTYPE_UINT8): Remove, as these are no longer effective.
2019-10-07 00:08:19 -07:00
Akim Demaille
58302c6079 regen 2019-10-06 17:48:51 +02:00
Akim Demaille
9e6c5328d3 diagnostics: also show suggested %empty
* src/reader.c (grammar_rule_check_and_complete): Suggest to add %empty.
* tests/actions.at, tests/diagnostics.at: Adjust expectations.
2019-10-06 12:15:12 +02:00
Akim Demaille
fec13ce2db diagnostics: sort symbols per location
Because the checking of the grammar is made by phases after the whole
grammar was read, we sometimes have diagnostics that look weird.  In
some case, within one type of checking, the entities are not checked
in the order in which they appear in the file.  For instance, checking
symbols is done on the list of symbols sorted by tag:

    foo.y:1.20-22: warning: symbol BAR is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                    ^~~
    foo.y:1.16-18: warning: symbol QUX is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                ^~~

Let's sort them by location instead:

    foo.y:1.16-18: warning: symbol 'QUX' is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                ^~~
    foo.y:1.20-22: warning: symbol 'BAR' is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                    ^~~

* src/location.h (location_cmp): Be robust to empty file names.
* src/symtab.c (symbol_cmp): Sort by location.
* tests/input.at: Adjust expectations.
2019-10-06 09:54:25 +02:00
Akim Demaille
be3cf406af diagnostics: suggest fixes for undeclared symbols
From

    input.y:1.17-19: warning: symbol baz is used, but is not defined as a token and has no rules [-Wother]
         1 | %printer {} foo baz
           |                 ^~~

to

    input.y:1.17-19: warning: symbol 'baz' is used, but is not defined as a token and has no rules; did you mean 'bar'? [-Wother]
        1 | %printer {} foo baz
          |                 ^~~
          |                 bar

* bootstrap.conf: We need fstrcmp.
* src/symtab.c (symbol_from_uniqstr_fuzzy): New.
(complain_symbol_undeclared): Use it.
* tests/diagnostics.at (Suggestions): New.
* data/bison-default.css (insertion): Rename as...
(fixit-insert): this, as this is what GCC uses.
2019-10-06 09:54:25 +02:00
Akim Demaille
126c4622de style: isolate complain_symbol_undeclared
* src/symtab.c (complain_symbol_undeclared): New.
Use it.
Use quote on the guilty symbol (like GCC does, and we also do
elsewhere).
* tests/input.at: Adjust.
2019-10-06 09:54:25 +02:00
Akim Demaille
dd64eaf9db style: simplify the handling of symbol and semantic_type tables
Both are stored in a hash, and back in the days, we used to iterate
over these tables using hash_do_for_each.  However, the order of
traversal was not deterministic, which was a nuisance for
deterministic output (and therefore also a problem for tests).  So at
some point (83b60c97ee) we generated a
sorted list of these symbols, and symbols_do actually iterated on that
list.  But we kept the constraints of using hash_do_for_each, which
requires a lot of ceremonial code, and makes it hard/unnatural to
preserve data between iterations (see the next commit).

Alas, this is C, not C++.

Let's remove this abstraction, and directly iterate on the sorted
tables.

* src/symtab.c (symbols_do): Remove.
Adjust callers to use a simple for-loop instead.
(table_sort): New.
(symbols_check_defined): Use it.
(symbol_check_defined_processor, symbol_pack_processor)
(semantic_type_check_defined_processor, symbol_translation_processor):
Remove.
Simplify the corresponding functions (that no longer need to return a
bool).
2019-10-06 09:54:20 +02:00
Akim Demaille
0b585c49ae diagnostics: display suggested update after the caret-info
This commit adds the suggestion in green, on the line below the
caret-and-tildes.

    foo.y:1.1-14: warning: deprecated directive: '%error-verbose', use '%define parse.error verbose' [-Wdeprecated]
        1 | %error-verbose
          | ^~~~~~~~~~~~~~
          | %define parse.error verbose

The current approach, with location_caret_suggestion, is fragile:
there's a protocol of calls to the complain functions which is strict.
We should rather have a richer structure describing the diagnostics,
including with submessages such as the suggestions, passed in the end
to the routines in charge of formatting and printing them.

* src/location.h, src/location.c (location_caret_suggestion): New.
* src/complain.c (deprecated_directive): Use it.
* tests/diagnostics.at, tests/input.at: Adjust expectations.
2019-10-06 08:07:57 +02:00
Akim Demaille
37c4d0b175 diagnostics: isolate caret_set_column
* src/location.c (caret_info): Add width and skip members.
(caret_set_column): New.
Use it.
2019-10-06 08:07:57 +02:00
Akim Demaille
56bcccbc51 diagnostics: isolate caret_set_file
* src/location.c (caret_set_file): New.
Store the current line's length in caret_info.line_len.
Pay attention to fseek's return value.
Extracted from...
(location_caret): here.
2019-10-06 08:07:57 +02:00
Paul Eggert
8f5aaa0e04 Avoid quiet conversion of pointer to bool
* src/location.c (caret_set_file):
* src/scan-code.l (contains_dot_or_dash):
Do not quietly convert pointer to bool, as Oracle Developer Studio
12.6 complains and it is arguably confusing style anyway.
2019-10-05 01:19:39 -07:00
Paul Eggert
b75b055288 Port ARGMATCH_DEFINE_GROUP calls to C99
* src/complain.c, src/getargs.c: Omit ‘;’ after call
to ARGMATCH_DEFINE_GROUP, as C99 does not allow ‘;’ there.
2019-10-05 01:19:39 -07:00