It is not used. And its implementation was wrong when api.token.raw
was defined, as it was still mapping to the external token numbers,
instead of the internal ones. Besides it was provided only when
api.token.constructor is defined, yet always declared.
* data/skeletons/c++.m4 (by_type::token): Remove, useless.
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html
The cast is needed when yytranslate_'s argument type is token_type,
i.e., when api.token.constructor is defined.
373. types.at:138: testing lalr1.cc api.value.type=variant api.token.constructor ...
======== Testing with C++ standard flags: ''
../../tests/types.at:138: bison --color=no -fno-caret -o test.cc test.y
../../tests/types.at:138: $CXX $CXXFLAGS $CPPFLAGS $LDFLAGS -o test test.cc $LIBS
stderr:
test.cc:966:16: error: result of comparison of constant 257 with
expression of type 'yy::parser::token_type'
(aka 'yy::parser::token::yytokentype') is always true
[-Werror,-Wtautological-constant-out-of-range-compare]
else if (t <= user_token_number_max_)
~ ^ ~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
It is because it is expected that when api.token.constructor is
defined, only symbol constructors will be used, that yytranslate_ then
takes a token_type. But it is wrong: we still allow literal
characters in this case, as demonstrated by test 373 for instance.
%define api.value.type variant
%define api.token.constructor
%token <std::pair<int, int>> '1' '2';
[...]
static yy::parser::symbol_type yylex ()
{
static char const input[] = "12";
int res = input[toknum++];
typedef yy::parser::symbol_type symbol;
if (res)
return symbol (res, std::make_pair (res - '0', res - '0' + 1));
else
return symbol (res);
}
So let yytranslate_ always take an int, which makes the cast truly
useless.
* data/skeletons/c++.m4, data/skeletons/lalr1.cc (yytranslate_): here.
The C++ implementation of LAC did not skip the $undefined token,
probably because it was not exposed. Expose it, and use clearer
names.
* data/skeletons/c++.m4: Don't define undef_token_ in yytranslate_,
but...
* data/skeletons/lalr1.cc (yy_undef_token_): here.
Use a more precise type to define yy_undef_token_ and yy_error_token_.
Unfortunately we move from a compile-time value defined via an enum to
a static const member. Eventually we should make it constexpr.
Make LAC implementation more alike yacc.c's one.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java: Don't expose
yyuser_token_number_max_ and yyundef_token_. Do as in C++: scope them
into yytranslate_, and only when api.token.raw is not defined.
(yyterror_): Rename as...
(yy_error_token_): this.
* data/skeletons/lalr1.d (token_number_type): New.
Use it.
Can't be done in the Java backend, as Java does not have type aliases.
* data/skeletons/lalr1.d, data/skeletons/lalr1.java (yytoken_number_):
Remove, useless.
Was used in ancient C skeletons to support YYPRINT, long obsoleted by
%printer.
It is not used at all. We will remove it also from yacc.c, but
later (see TODO).
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/lalr1.java (yyerrcode_):
Remove.
Reported by Frank Heckenbach.
https://lists.gnu.org/archive/html/bug-bison/2019-11/msg00016.html
* data/skeletons/c++.m4 (b4_yytranslate_define): Don't use yyeof_ as
if it had two different types.
It is used once against the input argument, which is the value
returned by yylex, which is an "external token number", typically an
int. It is also used as output type, an "internal symbol number".
It turns out that in both cases we mean "0", but let's keep yyeof_
only for the case "internal symbol number", i.e., _after_ conversion
by yytranslate.
This frees us from one cast.
The current code for yysyntax_error for %define parse.error verbose is
fishy (given that YYEMPTY is -2, invalid argument for yytname[]):
static int
yysyntax_error ([...])
{
YYPTRDIFF_T yysize0 = yytnamerr (YY_NULLPTR, yytname[yytoken]);
[...]
if (yytoken != YYEMPTY)
A nearby comment reports
The only way there can be no lookahead present (in yychar) is if
this state is a consistent state with a default action. Thus,
detecting the absence of a lookahead is sufficient to determine
that there is no unexpected or expected token to report. In that
case, just report a simple "syntax error".
So it _is_ possible to call yysyntax_error with yytoken == YYEMPTY,
albeit quite difficult when meaning to, so virtually impossible by
accident (after all, there was never a bug report about this).
I failed to produce a test case, but Joel E. Denny provided me with
one (added to the test suite below). The yacc.c skeleton fails on
this, and once fixed dies on a second problem. The glr.c skeleton was
also dying, but immediately of this second problem.
Indeed we were not allocating space for the error message's final \0.
This was hidden by the fact that we only had error messages with at
least an unexpected token displayed, so with at least one "%s" in the
format string, whose size (2) was included (incorrectly) in the final
size of the message (where the %s have been replaced by the actual
content).
* data/skeletons/glr.c, data/skeletons/yacc.c (yysyntax_error):
Do not invoke yytnamerr on YYEMPTY.
Clarify the computation of the length of the _final_ error message,
with the NUL terminator but without the '%s's.
* tests/conflicts.at (Syntax error in consistent error state):
New, contributed by Joel E. Denny.
We still have a few old C casts in lalr1.cc, let's get rid of them.
Reported by Frank Heckenbach.
Actually, let's monitor all our casts using easy to grep macros.
Let's use these macros to use the C++ standard casts when we are in
C++.
* data/skeletons/c.m4 (b4_cast_define): New.
* data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc, data/skeletons/stack.hh,
* data/skeletons/yacc.c:
Use it and/or its casts.
* tests/actions.at, tests/cxx-type.at,
* tests/glr-regression.at, tests/headers.at, tests/torture.at,
* tests/types.at:
Use YY_CAST instead of C casts.
* configure.ac (warn_cxx): Add -Wold-style-cast.
* doc/bison.texi: Disable it.
* data/skeletons/location.cc: Remove the u (for unsigned) suffix from
the initial line and column.
* NEWS: AFAICT, only C++ backends have their location types changed.
This skeleton uses a single stack of state structures, so it is less
likely to benefit from a stack size reduction than yacc.c (which uses
several stacks: state number, value and location). But it will reduce
the size of the LAC stack.
This skeleton was already using int for state numbers, so, contrary to
yacc.c, this brings nothing for large automata.
Overall, it is still nicer to make the skeletons alike.
* data/skeletons/lalr1.cc (state_type): Here.
The documentation for Oracle Solaris Studio 12.3 (Sun C++ 5.12
2011/11/16) says it supports C++03. This compiler rejects the
location.cc use of std::max for some reason; I don’t know why
since I don’t use C++ as a rule. The simplest workaround is to
open-code ‘max’.
* data/skeletons/location.cc (add_):
Do max by hand rather than relying on std::max.
Don’t include <algorithm.h>; no longer needed.
Sun C 5.12 defines __SUNPRO_C to 0x5120 but diagnoses
‘__attribute__ ((__unused__))’. Change the ifdefs to use
the same method as Gnulib in this area.
* data/skeletons/c.m4 (YY_ATTRIBUTE): Remove, since
not all attributes were added in the same compiler version.
(YY_ATTRIBUTE_PURE, YY_ATTRIBUTE_UNUSED):
Use specific GCC version for each attribute.
Pay no attention to __SUNPRO_C.
* tests/headers.at (Several parsers): Tighten tests accordingly.
Oracle Solaris Studio 12.3 (Sun C 5.12 2011/11/16) by default does
not conform to C99; it defines __STDC_VERSION__ to be 199409L, so
the Bison code does not include <stdint.h> (not required by C89
amendment 1) even though this compiler does have <stdint.h>. On
this platform <limits.h> defines INT_LEAST8_MAX (POSIX allows
this) so the skeleton got confused and thought that <stdint.h> had
been included even though it wasn’t.
* data/skeletons/c.m4 (b4_c99_int_type_define) [!__PTRDIFF_MAX__]:
Always include <limits.h>.
(YY_STDINT_H): Define when <stdint.h> was included.
All uses of expressions like ‘defined INT_LEAST8_MAX’ changed to
‘defined YY_STDINT_H’, since Sun C 5.12 <limits.h> defines macros
like INT_LEAST8_MAX but does not declare types like int_least8_t.
* data/skeletons/c.m4 (b4_c99_int_type_define): Reorder to put the
signed types first, since they’re simpler and this keeps similar
code closer. For signed types, don’t bother checking whether the
type promotes to int since the type must be signed anyway. For
unsigned types, protect a test like ‘UCHAR_MAX <= INT_MAX’ with
‘!defined __UINT_LEAST8_MAX__’, as otherwise the logic is wrong
for oddball platforms; and once we do that, there should no need
for ‘defined INT_MAX’ so remove that.
A number of portability issues with GCC 4.6 .. 4.9 (inclusive):
input.c:184:7: error: "UCHAR_MAX" is not defined [-Werror=undef]
#elif UCHAR_MAX <= INT_MAX
^
input.c:184:20: error: "INT_MAX" is not defined [-Werror=undef]
#elif UCHAR_MAX <= INT_MAX
^
input.c:202:7: error: "USHRT_MAX" is not defined [-Werror=undef]
#elif USHRT_MAX <= INT_MAX
^
input.c:202:20: error: "INT_MAX" is not defined [-Werror=undef]
#elif USHRT_MAX <= INT_MAX
^
* data/skeletons/c.m4 (b4_c99_int_type_define): Don't assume they are
defined.
That way, glr.c can use it too.
* data/skeletons/c.m4 (b4_int_type):
Do not special-case ‘char’; it’s not worth the trouble,
as clang complains about char subscripts.
(b4_c99_int_type, b4_c99_int_type_define): New macros,
taken from yacc.c.
* data/skeletons/glr.c: Use b4_int_type_define.
* data/skeletons/yacc.c (b4_int_type): Remove, since there’s
no longer any need to redefine it.
Use b4_c99_int_type_define rather than its body.
This changes the Yacc skeleton to use “least” integer types to
keep tables smaller on some platforms, which should lessen cache
pressure. Since Bison uses the Yacc skeleton, it follows suit.
* data/skeletons/yacc.c: Include limits.h and stdint.h if this
seems to be needed.
(yytype_uint8, yytype_int8, yytype_uint16, yytype_int16):
If available, use GCC predefined macros __INT_MAX__ etc. to select
a “least” type, as this avoids namespace hassles. Otherwise, if
available fall back on selecting a “least” type via the C99 macros
INT_MAX, INT_LEAST8_MAX, etc. Otherwise, fall further back on one of
the builtin C99 types signed char, short, and int. Make sure that
any selected type promotes to int. Ignore any macros YYTYPE_INT16,
YYTYPE_INT8, YYTYPE_UINT16, YYTYPE_UINT8 defined by the user.
(ptrdiff_t, PTRDIFF_MAX): Simplify in the light of the above.
(yytype_uint8, yytype_uint16): Do not assume that unsigned char
and unsigned short promote to int, as this isn’t true on some
platforms (e.g., TI TMS320C55x).
* src/parse-gram.y (YYTYPE_INT16, YYTYPE_INT8, YYTYPE_UINT16)
(YYTYPE_UINT8): Remove, as these are no longer effective.
* data/skeletons/yacc.c (YYPTRDIFF_T, YYPTRDIFF_MAXIMUM):
Default to long, not int.
(yy_lac_stack_realloc, yy_lac, yytnamerr, yyparse):
Avoid casts to YYPTRDIFF_T that were masking the problem.
input.c: In function 'int yyparse()':
input.c: error: conversion to 'long int' from 'long unsigned int'
may change the sign of the result [-Werror=sign-conversion]
yyes_capacity = sizeof yyesa / sizeof *yyes;
^
cc1plus: all warnings being treated as errors
* data/skeletons/yacc.c: here.
For instance with GCC 4.9 and --enable-gcc-warnings:
25. input.at:1201: testing Torturing the Scanner ...
../../tests/input.at:1344: $CC $CFLAGS $CPPFLAGS -c -o input.o input.c
stderr:
input.c:239:18: error: "__STDC_VERSION__" is not defined [-Werror=undef]
# elif 199901 <= __STDC_VERSION__
^
input.c:256:18: error: "__STDC_VERSION__" is not defined [-Werror=undef]
# elif 199901 <= __STDC_VERSION__
^
* data/skeletons/yacc.c: Check that __STDC_VERSION__ is defined before
using it.
On the CI with GCC 6:
examples/c++/calc++/parser.cc:845:5: error: 'ptrdiff_t' was not declared in this scope
ptrdiff_t yycount = 0;
^~~~~~~~~
examples/c++/calc++/parser.cc:845:5: note: suggested alternatives:
/usr/include/x86_64-linux-gnu/c++/6/bits/c++config.h:202:28: note: 'std::ptrdiff_t'
typedef __PTRDIFF_TYPE__ ptrdiff_t;
^~~~~~~~~
* data/skeletons/lalr1.cc: Qualify ptrdiff_t and size_t with std::.
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
* NEWS: Document the API change.
* boostrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about. Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
* NEWS: Mention this.
* data/skeletons/c.m4 (b4_int_type):
Prefer char if it will do, and prefer signed types to unsigned if
either will do.
* data/skeletons/glr.c (yy_reduce_print): No need to
convert rule line to unsigned long.
(yyrecoverSyntaxError): Put action into an int to
avoid GCC warning of using a char subscript.
* data/skeletons/lalr1.cc (yy_lac_check_, yysyntax_error_):
Prefer ptrdiff_t to size_t.
* data/skeletons/yacc.c (b4_int_type):
Prefer signed types to unsigned if either will do.
* data/skeletons/yacc.c (b4_declare_parser_state_variables):
(YYSTACK_RELOCATE, YYCOPY, yy_lac_stack_realloc, yy_lac)
(yytnamerr, yysyntax_error, yyparse): Prefer ptrdiff_t to size_t.
(YYPTRDIFF_T, YYPTRDIFF_MAXIMUM): New macros.
(YYSIZE_T): Fix "! defined YYSIZE_T" typo.
(YYSIZE_MAXIMUM): Take the minimum of PTRDIFF_MAX and SIZE_MAX.
(YYSIZEOF): New macro.
(YYSTACK_GAP_MAXIMUM, YYSTACK_BYTES, YYSTACK_RELOCATE)
(yy_lac_stack_realloc, yyparse): Use it.
(YYCOPY, yy_lac_stack_realloc): Cast to YYSIZE_T to pacify GCC.
(yy_reduce_print): Use int instead of unsigned long when int
will do.
(yy_lac_stack_realloc): Prefer long to unsigned long when
either will do.
* tests/regression.at: Adjust to these changes.
Currently we properly use the "best" integral type for tables,
including those storing state numbers. However the variables for
state numbers used in yyparse (and its dependencies such as
yy_stack_print) still use int16_t invariably. As a consequence, very
large models overflow these variables.
Let's use the "best" type for these variables too. It turns out that
we can still use 16 bits for twice larger automata: stick to unsigned
types.
However using 'unsigned' when 16 bits are not enough is troublesome
and generates tons of warnings about signedness issues. Instead,
let's use 'int'.
Reported by Tom Kramer.
https://lists.gnu.org/archive/html/bug-bison/2019-09/msg00018.html
* data/skeletons/yacc.c (b4_state_num_type): New.
(yy_state_num): Be computed from YYNSTATES.
* tests/linear: New.
* tests/torture.at (State number type): New.
Use it.
Instead of
#define YYPACT_NINF -130
#define yypact_value_is_default(Yystate) \
(!!((Yystate) == (-130)))
generate
#define YYPACT_NINF (-130)
#define yypact_value_is_default(Yyn) \
((Yyn) == YYPACT_NINF)
* data/skeletons/c.m4 (b4_table_value_equals): Add support for $4.
* data/skeletons/glr.c, data/skeletons/yacc.c: Use it.
Also, use shorter macro argument names, the name of the macro is clear
enough.
* upstream/maint:
c++: add copy ctors for compatibility with the IAR compiler
CI: show git status
CI: disable ICC
tests: pass -jN from Make to the test suite
quotearg: avoid leaks
maint: post-release administrivia
Reported by Andreas Damm.
https://savannah.gnu.org/support/?110032
* data/skeletons/lalr1.cc (stack_symbol_type::operator=): New
overload, const, to please the IAR C++ compiler (version ca 2013).