This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
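The rationale can be sketched as follows. This is an illustration only,
not code from the patch: checked_add and wrapping_add are hypothetical
helpers. Unsigned arithmetic wraps silently by definition, whereas
signed overflow is undefined behavior that a sanitizer can trap, so
signed code tests the bounds before the operation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store A + B in *SUM and return true,
   or return false if the sum does not fit in ptrdiff_t.  */
static bool
checked_add (ptrdiff_t a, ptrdiff_t b, ptrdiff_t *sum)
{
  if (b > 0 ? a > PTRDIFF_MAX - b : a < PTRDIFF_MIN - b)
    return false;
  *sum = a + b;
  return true;
}

/* Unsigned addition never overflows in the C sense: it wraps modulo
   SIZE_MAX + 1, so a sanitizer has nothing to report here.  */
static size_t
wrapping_add (size_t a, size_t b)
{
  return a + b;
}
```

For instance, wrapping_add (SIZE_MAX, 1) quietly yields 0, while
checked_add (PTRDIFF_MAX, 1, &s) reports the failure to its caller.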
* NEWS: Document the API change.
* bootstrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about. Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
* bootstrap.conf: We need winsz-ioctl and winsz-termios.
* src/location.c (columns): Use winsize to get the number of
columns.
Code taken from the GNU Coreutils.
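A minimal sketch of the idea, assuming a POSIX system where
<sys/ioctl.h> provides TIOCGWINSZ. The name screen_columns, the
precedence given to COLUMNS, and the fallback of 80 are assumptions of
this sketch, not a copy of the code in src/location.c.

```c
#include <limits.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: number of columns available on stderr.
   Prefer $COLUMNS, then the terminal's window size, then 80.  */
static int
screen_columns (void)
{
  const char *cp = getenv ("COLUMNS");
  if (cp && *cp)
    {
      char *end;
      long n = strtol (cp, &end, 10);
      if (*end == '\0' && 0 < n && n <= INT_MAX)
        return (int) n;
    }
#ifdef TIOCGWINSZ
  {
    /* Ask the terminal driver; this fails when stderr is not a tty.  */
    struct winsize ws;
    if (ioctl (STDERR_FILENO, TIOCGWINSZ, &ws) != -1 && 0 < ws.ws_col)
      return ws.ws_col;
  }
#endif
  return 80;
}
```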
* src/location.h, src/location.c (caret_init): New.
* src/complain.c (complain_init): Call it.
* tests/bison.in: Export COLUMNS so that users of tests/bison can
enjoy proper line truncation.
* src/location.c (min_int, columns): New.
(location_caret): Compute the line width. Based on it, compute how
many columns must be skipped before the quoted location and truncated
after, to fit the screen width.
* tests/local.at (AT_QUELL_VALGRIND): Transform into...
(AT_SET_ENV_IF, AT_SET_ENV): these.
Define COLUMNS to protect the test suite from the user's environment.
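The skip/truncate computation can be sketched as below. This is a
simplified illustration under stated assumptions: struct window and
fit_line are hypothetical names, columns are 0-based, and the gutter
and any ellipsis markers are ignored. It is not the algorithm used in
location_caret.

```c
#include <assert.h>

/* Hypothetical result: the visible window into a quoted line.  */
struct window
{
  int skip;   /* columns dropped before the window */
  int width;  /* columns shown */
};

/* Choose which part of a LINE_WIDTH-column line to show on a
   SCREEN_WIDTH-column screen so that 0-based column CARET is visible.  */
static struct window
fit_line (int line_width, int caret, int screen_width)
{
  struct window res = { 0, line_width };
  assert (0 < screen_width && 0 <= caret && caret < line_width);
  if (line_width <= screen_width)
    return res;                 /* the whole line fits */
  res.width = screen_width;
  if (caret < screen_width)
    return res;                 /* truncating the tail is enough */
  /* Center the caret, without running past the end of the line.  */
  res.skip = caret - screen_width / 2;
  if (res.skip > line_width - screen_width)
    res.skip = line_width - screen_width;
  return res;
}
```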
So far diagnostics were cheating: in addition to the 'column' field of
locations (based on actual screen width per multibyte characters and
on tabulation expansion), the scanner sets the 'byte' field.
Diagnostics used this byte count to decide where to insert (color)
style.
We want to be able to truncate the quoted lines when they are too
wide to fit the screen. This requires that the diagnostics learn how
to count columns: the byte-in-boundary trick no longer works.
Bytes are still used for fix-its.
* bootstrap.conf: We need mbfile for mbf_getc.
* src/location.c (caret_info): We need an mbfile.
(caret_set_file): Initialize it.
(caret_getc): Convert to mbfile.
(location_caret): Instead of relying on the byte position to decide
where to insert the color style, count the current column using
boundary_compute.
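Counting columns rather than bytes can be sketched with the standard
mbrtowc and wcwidth functions. The helper column_after is hypothetical,
and the tab stop of 8 and the handling of invalid sequences are
assumptions of this sketch; the actual code goes through mbfile and
boundary_compute.

```c
#define _XOPEN_SOURCE 700   /* for wcwidth */
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return the 1-based column reached after
   printing S, starting from 1-based COLUMN.  Multibyte characters are
   decoded in the current locale; tabs advance to the next multiple
   of 8 plus 1.  */
static int
column_after (const char *s, int column)
{
  mbstate_t st;
  size_t left = strlen (s);
  assert (1 <= column);
  memset (&st, 0, sizeof st);
  while (0 < left)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, left, &st);
      if (n == (size_t) -1 || n == (size_t) -2 || n == 0)
        {
          /* Invalid or incomplete sequence: count one byte as one
             column and restart decoding.  */
          memset (&st, 0, sizeof st);
          n = 1;
          column += 1;
        }
      else if (wc == L'\t')
        column += 8 - ((column - 1) & 7);
      else
        {
          int w = wcwidth (wc);
          if (0 < w)
            column += w;
        }
      s += n;
      left -= n;
    }
  return column;
}
```

Call setlocale (LC_ALL, "") first if the input may contain non-ASCII
characters; in the default "C" locale only ASCII is decoded.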
* src/location.c (caret_info): Replace file and line with pos, a
boundary. This will allow us to use features of the boundary type,
such as boundary_compute.
The handling of the contributions of the tabulations in the columns is
buried inside location_compute. We will soon want to use the
boundary part of the computation (to compute the current column number
each time we read a multibyte char).
* src/location.c (boundary_compute): New, extracted from...
(location_compute): here.
We used to treat lone CRs (\r, aka ^M) as regular NLs (\n), probably
to please Classic MacOS. As of today, it makes more sense to treat \r
like a plain white space character.
https://lists.gnu.org/archive/html/bison-patches/2019-09/msg00027.html
* src/scan-gram.l (no_cr_read): Remove. Instead, use...
(eol): this new abbreviation denoting end-of-line.
* src/location.c (caret_getc): New.
(location_caret): Use it.
* tests/diagnostics.at (Carriage return): Adjust expectations.
(CR NL): New.
When the input file contains lone CRs (aka, ^M, \r), the locations see
a new line. Diagnostics look only at \n as end-of-line, so sometimes
there is an offset in diagnostics. Worse yet: sometimes we loop
endlessly waiting for \n to come from a continuous stream of EOF.
Fix that:
- check for EOF
- beware not to call end_use_class if begin_use_class was not
called (which would abort). This could happen if the actual
line is shorter than the expected one.
Prompted by a (private) report from Marc Schönefeld.
* src/location.c (location_caret): here.
* tests/diagnostics.at (Carriage return): New.
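The EOF check can be illustrated with a small sketch; skip_lines is a
hypothetical helper, not the code in location_caret. Only '\n' ends a
line, and the loop stops at end of file instead of waiting for a '\n'
that will never come.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: advance IN past COUNT line endings, where only
   '\n' ends a line (a lone '\r' is an ordinary character).  Return
   false if end of file is reached first.  */
static bool
skip_lines (FILE *in, int count)
{
  while (0 < count)
    {
      int c = getc (in);
      if (c == EOF)
        return false;
      if (c == '\n')
        --count;
    }
  return true;
}
```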
With
    %token EOF 0 EOF 0
we get
    input.y:3.14-16: warning: symbol EOF redeclared [-Wother]
        3 | %token EOF 0 EOF 0
          |              ^~~
    input.y:3.8-10: previous declaration
        3 | %token EOF 0 EOF 0
          |        ^~~
    Assertion failed: (nsyms == ntokens + nvars), function check_and_convert_grammar,
    file /Users/akim/src/gnu/bison/src/reader.c, line 839.
Reported by Marc Schönefeld.
* src/symtab.c (symbol_user_token_number_set): Register only the
first definition of the end of input token.
* tests/input.at (Symbol redeclared): Check that case.
hash_initialize returns NULL when out of memory. Check for it, and
die cleanly instead of crashing.
Reported by 江 祖铭 (Zu-Ming Jiang).
https://lists.gnu.org/archive/html/bug-bison/2019-08/msg00015.html
* src/muscle-tab.c, src/state.c, src/symtab.c, src/uniqstr.c:
Check the value returned by hash_initialize.
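The pattern is the usual one for allocators that return NULL. The
sketch below uses malloc as a stand-in, since hash_initialize belongs
to gnulib; check_nonnull is a hypothetical name, and the way Bison
reports the failure is not shown here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: return P unless it is NULL, in which case
   report memory exhaustion and exit, rather than crash later when
   the null pointer is dereferenced.  */
static void *
check_nonnull (void *p)
{
  if (!p)
    {
      fputs ("memory exhausted\n", stderr);
      exit (EXIT_FAILURE);
    }
  return p;
}
```

A caller then writes, for example, int *v = check_nonnull (malloc
(sizeof *v)); and may use v without further tests.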
https://lists.gnu.org/archive/html/bison-patches/2019-08/msg00007.html
When Bison is started with a flag that suppresses warning messages, the
error_message() function can produce a few gigabytes of indentation
because of a dangling pointer.
* src/complain.c (error_message): Don't reset indent_ptr here, but...
(complain_indent): here.
* tests/diagnostics.at (Indentation with message suppression): Check
this case.
See the previous commit. This option should be removed: -o suffices.
* src/getargs.c (FIXED_OUTPUT_FILES): New.
Add support for it.
(getargs): Define loc, and use it.
This is safer when we need to pass a pointer to a location.
The name fixed-output-files is pretty clear: generate y.tab.c, as Yacc
does. So let's detach this from %yacc which does more: it requires
POSIX Yacc behavior.
This directive has been obsolete since December 29th 2001
(8c9a50bee1). It does not show in the
doc. I don't want to spend more time on improving its diagnostics; it
could just as well be removed as far as I'm concerned.
* src/scan-gram.l, src/parse-gram.y (%fixed-output-files): Detach from
%yacc.
Years ago we moved from 'look-ahead' to 'lookahead', and that alias
was kept for backward compatibility. But now that we use argmatch to
generate the documentation, that value clutters the doc.
* src/getargs.c (argmatch_report_args): Remove the
--report=look-aheads alias.
The doc says that -Dfoo=bar is the same as %define foo "bar". It is
not: the quotes are not added (and it makes a difference).
* doc/bison.texi (Tuning the Parser): Fix the definition of -D/-F.
* src/getargs.c (usage): Likewise.
Let's clarify --help: use clearer "section" names, as in the doc.
Move --yacc to where it belongs.
* src/getargs.c (usage): Rename "Parser" as "Tuning the Parser", as in
the doc.
Rename "Output" as "Output Files".
Move --yacc to "Tuning the Parser".
* doc/bison.texi: Likewise.
It can now generate the usage message.
* src/complain.h (feature_fixit_parsable): Rename as...
(feature_fixit): this, for column economy.
Adjust dependencies.
(warning_usage): New.
Use it.
* src/complain.h, src/complain.c, src/getargs.h, src/getargs.c:
Use ARGMATCH_DEFINE_GROUP instead of the older interface.
The code is inconsistent: sometimes we pass by value, sometimes by
reference. Let's stick to the latter, which is more conventional for
large values in C.
* src/scan-code.l: Pass locations by reference.
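As an illustration of the convention, here is a sketch with simplified
stand-in types; struct position, struct span and span_columns are
hypothetical and are not Bison's own location type or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a position and a location.  */
struct position
{
  const char *file;
  int line;
  int column;
};

struct span
{
  struct position start;
  struct position end;
};

/* Take the location by reference: only a pointer is copied, and
   const documents that the callee does not modify it.  */
static int
span_columns (const struct span *loc)
{
  assert (loc != NULL);
  return loc->end.column - loc->start.column;
}
```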
Sadly enough, AFAIK, there were never answers to the "More user
feedback will help to stabilize it" sentences. Remove them.
* src/getargs.c: IELR, canonical LR and XML output are here to stay,
and they are no more experimental than some other features.
* doc/bison.texi: Likewise.
Also remove "experimental" warning for Java, LAC, LR tuning options,
and named references.
This is an experiment. Maybe more styles will be used (in which case
a short-hand function will be useful), maybe it will be just reverted.
* data/bison-default.css (.traces0): New.
* src/lalr.c (lalr): Use it.