Commit Graph

2578 Commits

Author SHA1 Message Date
Akim Demaille
60ebd8e210 regen 2019-11-16 12:54:44 +01:00
kaneko y
3765e3e790 gram.c: Fix condition of aver
* src/gram.c (grammar_dump): Fix condition of aver.
What we want to check is that rhs is followed by its rule.
2019-11-12 08:39:28 +01:00
Yuichiro Kaneko
17d34c231b gram.c: also print terminals in grammar_dump
* src/gram.c (grammar_dump): Print terminals, like nonterminals.
* tests/sets.at (Reduced Grammar): Update the test case to match the
change, and add a test case where prec and assoc are used.
2019-11-11 10:37:30 +01:00
Akim Demaille
cce6c998b6 diagnostics: add missing translation
* src/muscle-tab.c (muscle_percent_define_check_kind): Here.
2019-11-03 09:24:12 +01:00
Akim Demaille
c53b379784 style: fix cpp indentation
Reported by syntax-check.

* src/system.h: here.
2019-10-29 09:00:46 +01:00
Akim Demaille
8228d96d33 reader: reduce the "scope" of global variables
We have too many global variables, adding structure would help.  For a
start, let's hide some of the variables closer to their usage.

* src/getargs.c, src/files.h (current_file): Move to...
* src/scan-gram.c: here.
* src/scan-gram.h (gram_in, gram__flex_debug): Remove, make them
private to the scanner.
* src/reader.h, src/reader.c (reader): Take a grammar file as argument.
Move the handling of scanner variables to...
* src/scan-gram.l (gram_scanner_open, gram_scanner_close): here.
(gram_scanner_initialize): Remove, replaced by gram_scanner_open.
* src/main.c: Adjust.
2019-10-26 10:39:01 +02:00
Akim Demaille
a5fc4e3b44 regen 2019-10-26 10:39:01 +02:00
Akim Demaille
3be912e4af parser: use grammar_file instead of current_file
* src/parse-gram (%initial-action): here.
(handle_skeleton): Don't depend on the current file name to look for
"local" skeletons (subject to changes coming from "#line"): depend
only on the initial file name, the one given on the command line.
2019-10-26 10:38:39 +02:00
Akim Demaille
4b4e532748 diagnostics: use grammar_file instead of current_file
Currently there are two globals denoting the input file: grammar_file
is the one from the command line, and current_file which might change
because of #line.  Use only the former.

* src/complain.c (error_message): here.
* tests/diagnostics.at: Adjust.
2019-10-26 09:11:40 +02:00
Akim Demaille
6e7d8ba6a7 reader: let symtab deal with the symbols
* src/reader.c (reader): Move the setting up of the builtin symbols to...
* src/symtab.c (symbols_new): here.
2019-10-25 07:48:07 +02:00
Akim Demaille
c680300a29 style: remove incorrect comment
Reported by Paul Eggert.

* src/system.h: here.
2019-10-25 07:41:38 +02:00
Akim Demaille
fa9871a2fb diagnostics: simplify location handling
Locations start at line 1.  Don't accept line 0.

* src/location.c (location_print): Don't print locations with line 0.
(location_caret): Simplify.
2019-10-24 18:00:43 +02:00
Akim Demaille
76597d01f3 build: reenable -Wtype-limits
See https://lists.gnu.org/archive/html/bug-bison/2019-10/msg00061.html
to https://lists.gnu.org/archive/html/bug-bison/2019-10/msg00073.html.

Paul Eggert's changes in gnulib do fix the issue for modern GCCs (7,
8, 9) on macOS.  Unfortunately these warnings are back on the
CI (GNU/Linux) with GCC 4.6, 4.7, (not 4.8) and 4.9.

Disable the warning locally.

* configure.ac (warn_common, warn_tests): Remove -Wtype-limits.
* src/system.h (IGNORE_TYPE_LIMITS_BEGIN, IGNORE_TYPE_LIMITS_END): New.
* src/InadequacyList.c, src/parse-gram.c, src/parse-gram.y,
* src/symtab.c: Use it.
2019-10-24 08:50:14 +02:00
Akim Demaille
bc5efb558d build: remove dmalloc support
Today sanitizers are a better alternative.

* m4/dmalloc.m4: Remove.
* configure.ac, src/system.h: Adjust.
2019-10-24 07:22:17 +02:00
Yuichiro Kaneko
3945beb1d2 style: update comment in reader.c
rrhs and rlhs were removed by b2ed6e5826.

* src/reader.c (packgram): Update comment.
2019-10-23 08:32:06 +02:00
Akim Demaille
048730c691 style: pacify syntax-check
* doc/.gitignore, src/complain.c, src/getargs.c,
* src/output.c: here.
2019-10-22 10:40:12 +02:00
Akim Demaille
ec64a0bc7e main: also free memory on errors
* src/derives.c (derives_free): Beware of NULL.
* src/main.c (main): Let the 'finish' label include memory release.
2019-10-21 17:18:32 +02:00
Akim Demaille
d76ea5ce06 style: reduce scope in derives
* src/derives.c: here.
And prefer prefix to postfix increment.
2019-10-21 17:18:32 +02:00
Akim Demaille
97d6da0c5b parser: clarify version checking
* src/parse-gram.y: Use the same conventions for gnulib as elsewhere:
<header.h>.
(str_to_version): New.
(handle_require): Use it.
Prefer < to >.
2019-10-20 17:57:28 +02:00
Paul Eggert
693e69f289 regen 2019-10-17 11:51:20 -07:00
Paul Eggert
8a4ec5d4e4 bison: check for int overflow in token numbers
* src/symtab.c: Include intprops.h
(symbol_user_token_number_set): Don’t allow user_token_number ==
INT_MAX because too much other code adds 1 to the user token number.
(symbols_token_translations_init): Complain on integer overflow
instead of indulging in undefined behavior.
2019-10-17 11:51:20 -07:00
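Note on 8a4ec5d4e4: the two checks described in this message can be sketched in plain C as below.  This is only an illustration under stated assumptions, not the code of the commit: the commit relies on gnulib's intprops.h, whereas the helper names checked_add and token_number_is_valid are hypothetical and use nothing but limits.h.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not Bison's code): store A + B in *RES and
   return true if the sum fits in int; otherwise return false and
   leave *RES untouched.  The test never evaluates A + B itself, so
   it cannot overflow.  */
bool
checked_add (int a, int b, int *res)
{
  if ((0 < b && INT_MAX - b < a) || (b < 0 && a < INT_MIN - b))
    return false;
  *res = a + b;
  return true;
}

/* Hypothetical validity test for a user token number: INT_MAX is
   rejected so that a later "number + 1" cannot overflow.  */
bool
token_number_is_valid (int n)
{
  return 0 <= n && n < INT_MAX;
}
```

A caller would report a diagnostic when either function returns false, instead of going on with a wrapped value.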
Paul Eggert
052215a138 bison: check for int overflow when scanning
* src/scan-gram.l: Include errno.h, for errno.
(scan_integer, handle_syncline): Check for integer overflow.
* tests/input.at (too-large.y): Adjust to match new diagnostics.
2019-10-17 11:51:20 -07:00
Paul Eggert
15c1b913cf bison: check version numbers more carefully
* src/parse-gram.y: Include intprops.h.
(handle_require): Don’t indulge in undefined behavior if the major
or minor number is out of range.  Instead, check that the
resulting value is nonnegative, fits in int, and that the minor
number is less than 100.  Also, check that a number was parsed.
2019-10-17 11:51:20 -07:00
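Note on 15c1b913cf: a minimal sketch of the checks listed in this message (a number was parsed, the result is nonnegative and fits in int, the minor number is less than 100).  The function name version_to_int and the MAJOR * 100 + MINOR encoding are assumptions made for the illustration; the real handle_require may differ in its details.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical sketch: convert "MAJOR.MINOR[...]" to
   MAJOR * 100 + MINOR.  Return -1 if no number was parsed, if a
   component is out of range, or if the result would not fit in int.  */
int
version_to_int (char const *version)
{
  if (!isdigit ((unsigned char) *version))
    return -1;
  char *end = NULL;
  errno = 0;
  long major = strtol (version, &end, 10);
  /* The bound guarantees that major * 100 + 99 <= INT_MAX.  */
  if (errno || *end != '.' || (INT_MAX - 99) / 100 < major)
    return -1;

  char const *minor_start = end + 1;
  if (!isdigit ((unsigned char) *minor_start))
    return -1;
  errno = 0;
  long minor = strtol (minor_start, &end, 10);
  if (errno || 100 <= minor)
    return -1;
  return (int) (major * 100 + minor);
}
```

With this encoding "3.4" yields 304, and anything after the minor number (as in "3.4.1") is ignored.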
Akim Demaille
d9d37a1196 i18n: don't push too hard for '…'
Suggested by Paul Eggert.

* src/location.c (ellipsis): Clarify comment for translators.
2019-10-12 10:43:53 +02:00
Akim Demaille
c3db1394a1 regen 2019-10-11 08:52:04 +02:00
Akim Demaille
2c66acfec0 diagnostics: prefer "…" to "..." if the locale supports it
* src/location.c (ellipsis, ellipsize): New.
Use them.
2019-10-10 21:57:50 +02:00
Paul Eggert
5463291a91 Use “least” types for integers in Yacc tables
This changes the Yacc skeleton to use “least” integer types to
keep tables smaller on some platforms, which should lessen cache
pressure.  Since Bison uses the Yacc skeleton, it follows suit.
* data/skeletons/yacc.c: Include limits.h and stdint.h if this
seems to be needed.
(yytype_uint8, yytype_int8, yytype_uint16, yytype_int16):
If available, use GCC predefined macros __INT_MAX__ etc. to select
a “least” type, as this avoids namespace hassles.  Otherwise, if
available fall back on selecting a “least” type via the C99 macros
INT_MAX, INT_LEAST8_MAX, etc.  Otherwise, fall further back on one of
the builtin C99 types signed char, short, and int.  Make sure that
any selected type promotes to int.  Ignore any macros YYTYPE_INT16,
YYTYPE_INT8, YYTYPE_UINT16, YYTYPE_UINT8 defined by the user.
(ptrdiff_t, PTRDIFF_MAX): Simplify in the light of the above.
(yytype_uint8, yytype_uint16): Do not assume that unsigned char
and unsigned short promote to int, as this isn’t true on some
platforms (e.g., TI TMS320C55x).
* src/parse-gram.y (YYTYPE_INT16, YYTYPE_INT8, YYTYPE_UINT16)
(YYTYPE_UINT8): Remove, as these are no longer effective.
2019-10-07 00:08:19 -07:00
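Note on 5463291a91: the C99 fallback described in this message (select a "least" type only if it promotes to int) can be sketched as below.  The type names least_int8 and least_uint8 and the table are hypothetical, and the sketch leaves out the first step of the message, the GCC predefined macros such as __INT_MAX__.

```c
#include <limits.h>
#include <stdint.h>

/* Signed 8-bit "least" type: use int_least8_t when its range fits
   in int, else fall back on signed char.  */
#if defined INT_LEAST8_MAX && INT_LEAST8_MAX <= INT_MAX
typedef int_least8_t least_int8;
#else
typedef signed char least_int8;
#endif

/* Unsigned 8-bit "least" type.  Do not assume that unsigned char
   promotes to int: fall back on short when it does not.  */
#if defined UINT_LEAST8_MAX && UINT_LEAST8_MAX <= INT_MAX
typedef uint_least8_t least_uint8;
#elif UCHAR_MAX <= INT_MAX
typedef unsigned char least_uint8;
#else
typedef short least_uint8;
#endif

/* Hypothetical table: every entry fits in 8 bits, so the table can
   use the smallest suitable element type.  */
const least_uint8 demo_table[] = { 0, 3, 7, 255 };
```

Because each selected type promotes to int, an expression such as demo_table[3] + 100 is computed in int and does not wrap.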
Akim Demaille
58302c6079 regen 2019-10-06 17:48:51 +02:00
Akim Demaille
9e6c5328d3 diagnostics: also show suggested %empty
* src/reader.c (grammar_rule_check_and_complete): Suggest adding %empty.
* tests/actions.at, tests/diagnostics.at: Adjust expectations.
2019-10-06 12:15:12 +02:00
Akim Demaille
fec13ce2db diagnostics: sort symbols per location
Because the checking of the grammar is made by phases after the whole
grammar was read, we sometimes have diagnostics that look weird.  In
some cases, within one type of checking, the entities are not checked
in the order in which they appear in the file.  For instance, checking
symbols is done on the list of symbols sorted by tag:

    foo.y:1.20-22: warning: symbol BAR is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                    ^~~
    foo.y:1.16-18: warning: symbol QUX is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                ^~~

Let's sort them by location instead:

    foo.y:1.16-18: warning: symbol 'QUX' is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                ^~~
    foo.y:1.20-22: warning: symbol 'BAR' is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                    ^~~

* src/location.h (location_cmp): Be robust to empty file names.
* src/symtab.c (symbol_cmp): Sort by location.
* tests/input.at: Adjust expectations.
2019-10-06 09:54:25 +02:00
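Note on fec13ce2db: sorting symbols by location rather than by tag can be sketched with qsort as below.  The types pos and sym and the functions pos_cmp and sym_cmp are hypothetical simplifications of Bison's location and symbol structures.  Treating a missing file name as an empty string is an assumption about what "robust to empty file names" means.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified position: file name, line and column.  */
typedef struct
{
  char const *file;
  int line;
  int column;
} pos;

/* Hypothetical, simplified symbol: a tag and where it first appears.  */
typedef struct
{
  char const *tag;
  pos start;
} sym;

/* Compare positions by file, then line, then column.  A NULL file
   name is treated as the empty string (assumption).  */
int
pos_cmp (pos a, pos b)
{
  int res = strcmp (a.file ? a.file : "", b.file ? b.file : "");
  if (!res)
    res = (b.line < a.line) - (a.line < b.line);
  if (!res)
    res = (b.column < a.column) - (a.column < b.column);
  return res;
}

/* qsort callback: order by location, and by tag as a tie-breaker so
   that the result stays deterministic.  */
int
sym_cmp (void const *a, void const *b)
{
  sym const *sa = a;
  sym const *sb = b;
  int res = pos_cmp (sa->start, sb->start);
  return res ? res : strcmp (sa->tag, sb->tag);
}
```

Sorting BAR (foo.y:1.20) and QUX (foo.y:1.16) with this callback puts QUX first, matching the order of the second listing in the message.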
Akim Demaille
be3cf406af diagnostics: suggest fixes for undeclared symbols
From

    input.y:1.17-19: warning: symbol baz is used, but is not defined as a token and has no rules [-Wother]
         1 | %printer {} foo baz
           |                 ^~~

to

    input.y:1.17-19: warning: symbol 'baz' is used, but is not defined as a token and has no rules; did you mean 'bar'? [-Wother]
        1 | %printer {} foo baz
          |                 ^~~
          |                 bar

* bootstrap.conf: We need fstrcmp.
* src/symtab.c (symbol_from_uniqstr_fuzzy): New.
(complain_symbol_undeclared): Use it.
* tests/diagnostics.at (Suggestions): New.
* data/bison-default.css (insertion): Rename as...
(fixit-insert): this, as this is what GCC uses.
2019-10-06 09:54:25 +02:00
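Note on be3cf406af: the commit uses gnulib's fstrcmp to find a similar symbol.  Since that module is not part of the standard C library, the sketch below swaps in a plain Levenshtein edit distance to show the idea of "did you mean".  The names edit_distance and closest_name, the MAX_LEN bound and the distance threshold are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum { MAX_LEN = 63 };

/* Levenshtein distance between A and B, or -1 if either string is
   longer than MAX_LEN.  Only one row of the table is kept.  */
int
edit_distance (char const *a, char const *b)
{
  size_t la = strlen (a);
  size_t lb = strlen (b);
  if (MAX_LEN < la || MAX_LEN < lb)
    return -1;
  int row[MAX_LEN + 1];
  for (size_t j = 0; j <= lb; ++j)
    row[j] = (int) j;
  for (size_t i = 1; i <= la; ++i)
    {
      int diag = row[0];        /* Value at (i - 1, j - 1).  */
      row[0] = (int) i;
      for (size_t j = 1; j <= lb; ++j)
        {
          int up = row[j];      /* Value at (i - 1, j).  */
          int subst = diag + (a[i - 1] != b[j - 1]);
          int del = up + 1;
          int ins = row[j - 1] + 1;
          int best = subst < del ? subst : del;
          row[j] = best < ins ? best : ins;
          diag = up;
        }
    }
  return row[lb];
}

/* Return the entry of KNOWN closest to NAME, or NULL if none is
   within MAX_DIST edits.  */
char const *
closest_name (char const *name, char const *const *known, size_t n,
              int max_dist)
{
  char const *best = NULL;
  int best_dist = max_dist + 1;
  for (size_t i = 0; i < n; ++i)
    {
      int d = edit_distance (name, known[i]);
      if (0 <= d && d < best_dist)
        {
          best = known[i];
          best_dist = d;
        }
    }
  return best;
}
```

With the known symbols foo and bar and a threshold of one edit, looking up baz returns bar, as in the diagnostic above.  A name with no close match returns NULL, in which case no suggestion would be printed.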
Akim Demaille
126c4622de style: isolate complain_symbol_undeclared
* src/symtab.c (complain_symbol_undeclared): New.
Use it.
Use quote on the guilty symbol (like GCC does, and we also do
elsewhere).
* tests/input.at: Adjust.
2019-10-06 09:54:25 +02:00
Akim Demaille
dd64eaf9db style: simplify the handling of symbol and semantic_type tables
Both are stored in a hash, and back in the day, we used to iterate
over these tables using hash_do_for_each.  However, the order of
traversal was not deterministic, which was a nuisance for
deterministic output (and therefore also a problem for tests).  So at
some point (83b60c97ee) we generated a
sorted list of these symbols, and symbols_do actually iterated on that
list.  But we kept the constraints of using hash_do_for_each, which
requires a lot of ceremonial code, and makes it hard/unnatural to
preserve data between iterations (see the next commit).

Alas, this is C, not C++.

Let's remove this abstraction, and directly iterate on the sorted
tables.

* src/symtab.c (symbols_do): Remove.
Adjust callers to use a simple for-loop instead.
(table_sort): New.
(symbols_check_defined): Use it.
(symbol_check_defined_processor, symbol_pack_processor)
(semantic_type_check_defined_processor, symbol_translation_processor):
Remove.
Simplify the corresponding functions (that no longer need to return a
bool).
2019-10-06 09:54:20 +02:00
Akim Demaille
0b585c49ae diagnostics: display suggested update after the caret-info
This commit adds the suggestion in green, on the line below the
caret-and-tildes.

    foo.y:1.1-14: warning: deprecated directive: '%error-verbose', use '%define parse.error verbose' [-Wdeprecated]
        1 | %error-verbose
          | ^~~~~~~~~~~~~~
          | %define parse.error verbose

The current approach, with location_caret_suggestion, is fragile:
there's a protocol of calls to the complain functions which is strict.
We should rather have a richer structure describing the diagnostics,
including submessages such as the suggestions, passed in the end
to the routines in charge of formatting and printing them.

* src/location.h, src/location.c (location_caret_suggestion): New.
* src/complain.c (deprecated_directive): Use it.
* tests/diagnostics.at, tests/input.at: Adjust expectations.
2019-10-06 08:07:57 +02:00
Akim Demaille
37c4d0b175 diagnostics: isolate caret_set_column
* src/location.c (caret_info): Add width and skip members.
(caret_set_column): New.
Use it.
2019-10-06 08:07:57 +02:00
Akim Demaille
56bcccbc51 diagnostics: isolate caret_set_file
* src/location.c (caret_set_file): New.
Store the current line's length in caret_info.line_len.
Pay attention to fseek's return value.
Extracted from...
(location_caret): here.
2019-10-06 08:07:57 +02:00
Paul Eggert
8f5aaa0e04 Avoid quiet conversion of pointer to bool
* src/location.c (caret_set_file):
* src/scan-code.l (contains_dot_or_dash):
Do not quietly convert pointer to bool, as Oracle Developer Studio
12.6 complains and it is arguably confusing style anyway.
2019-10-05 01:19:39 -07:00
Paul Eggert
b75b055288 Port ARGMATCH_DEFINE_GROUP calls to C99
* src/complain.c, src/getargs.c: Omit ‘;’ after call
to ARGMATCH_DEFINE_GROUP, as C99 does not allow ‘;’ there.
2019-10-05 01:19:39 -07:00
Paul Eggert
67dcef357c regen 2019-10-02 17:11:33 -07:00
Paul Eggert
133edcd248 Prefer signed to unsigned integers
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
* NEWS: Document the API change.
* bootstrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about.  Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
2019-10-02 17:11:33 -07:00
Akim Demaille
67bff62e31 diagnostics: get the screen width from the terminal
* bootstrap.conf: We need winsz-ioctl and winsz-termios.
* src/location.c (columns): Use winsize to get the number of
columns.
Code taken from the GNU Coreutils.
* src/location.h, src/location.c (caret_init): New.
* src/complain.c (complain_init): Call it.
* tests/bison.in: Export COLUMNS so that users of tests/bison can
enjoy proper line truncation.
2019-09-22 09:12:08 +02:00
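Note on 67bff62e31: a minimal sketch of asking the terminal for its width, assuming a POSIX-like system that provides sys/ioctl.h and the TIOCGWINSZ request.  The function name screen_columns, the precedence given to the COLUMNS environment variable and the fallback of 80 columns are assumptions made for the illustration, not a copy of the code taken from the GNU Coreutils.

```c
#include <limits.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical sketch: return the number of columns to use for
   diagnostics.  */
int
screen_columns (void)
{
  /* Let the user (or a test suite) force the width.  */
  char const *cp = getenv ("COLUMNS");
  if (cp && *cp)
    {
      char *end = NULL;
      long l = strtol (cp, &end, 10);
      if (*end == '\0' && 0 < l && l <= INT_MAX)
        return (int) l;
    }
#ifdef TIOCGWINSZ
  /* Otherwise ask the terminal attached to stderr, which is where
     diagnostics are printed.  */
  struct winsize ws;
  if (ioctl (STDERR_FILENO, TIOCGWINSZ, &ws) != -1 && 0 < ws.ws_col)
    return ws.ws_col;
#endif
  /* Not a terminal, or no usable answer: assume 80 columns.  */
  return 80;
}
```

Checking COLUMNS first is what lets a test harness pin the width regardless of the terminal it runs in.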
Akim Demaille
5f45cb05f1 diagnostics: don't print ellipsis on the caret line
From

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHIJKL
      | ...         ^~~~~~~~~~~~~~~~~~~~~~~~~~

to

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHI...
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~

* src/location.c (location_caret): here.
* tests/diagnostics.at: Adjust expectations.
2019-09-22 09:12:08 +02:00
Akim Demaille
b61b0eb9ac diagnostics: also show truncation at the end of line with "..."
From

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHIJKL
      | ...         ^~~~~~~~~~~~~~~~~~~~~~~~~~

to

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHI...
      | ...         ^~~~~~~~~~~~~~~~~~~~~~~~~~

* src/location.c (location_caret): here.
* tests/diagnostics.at: Adjust expectations.
2019-09-22 09:12:08 +02:00
Akim Demaille
f716484627 diagnostics: truncate quoted sources to fit the screen
* src/location.c (min_int, columns): New.
(location_caret): Compute the line width.  Based on it, compute how
many columns must be skipped before the quoted location and truncated
after, to fit the screen width.
* tests/local.at (AT_QUELL_VALGRIND): Transform into...
(AT_SET_ENV_IF, AT_SET_ENV): these.
Define COLUMNS to protect the test suite from the user's environment.
2019-09-22 09:12:08 +02:00
Akim Demaille
945b917da2 diagnostics: learn how to count column number with multibyte chars
So far diagnostics were cheating: in addition to the 'column' field of
locations (based on actual screen width per multibyte characters and
on tabulation expansion), the scanner sets the 'byte' field.
Diagnostics used this byte count to decide where to insert (color)
style.

We want to be able to truncate the quoted lines when they are too
wide to fit the screen.  This requires that the diagnostics learn how
to count columns: the byte-in-boundary trick no longer works.

Bytes are still used for fix-its.

* bootstrap.conf: We need mbfile for mbf_getc.
* src/location.c (caret_info): We need an mbfile.
(caret_set_file): Initialize it.
(caret_getc): Convert to mbfile.
(location_caret): Instead of relying on the byte position to decide
where to insert the color style, count the current column using
boundary_compute.
2019-09-22 09:12:08 +02:00
Akim Demaille
1ef407d923 diagnostics: style: rename member for clarity
* src/location.c (caret_info): Now that we no longer have a 'file'
member (see previous commit), rename 'source' as 'file'.
2019-09-22 09:12:08 +02:00
Akim Demaille
576b863e91 diagnostics: style: use a boundary to track the caret_info
* src/location.c (caret_info): Replace file and line with pos, a
boundary.  This will allow us to use features of the boundary type,
such as boundary_compute.
2019-09-22 09:12:08 +02:00
Akim Demaille
2274c34e91 diagnostics: extract boundary_compute from location_compute
The handling of the contribution of tabulations to the column count is
buried inside location_compute.  We will soon want to use the
boundary part of the computation (to compute the current column number
each time we read a multibyte char).

* src/location.c (boundary_compute): New, extracted from...
(location_compute): here.
2019-09-22 09:12:08 +02:00
Akim Demaille
fccab9bc40 diagnostics: style: add caret_set_file
To make the following commits easier to read.

* src/location.c (caret_set_file): New.
2019-09-22 09:12:08 +02:00
Akim Demaille
488607534a diagnostics: style: minor changes
* src/location.c (location_caret): Factor two branches of an if.
2019-09-22 09:12:08 +02:00