Currently pstate_new does not set up its variables, this task is left
to yypush_parse. This was probably to share more code with usual pull
parsers, where these (local) variables are indeed initialized by
yyparse.
But as a consequence yyexpected_tokens crashes at the very beginning
of the parse, since, for instance, the stacks are not even set up.
See https://lists.gnu.org/r/bison-patches/2020-03/msg00001.html.
The fix could have very simple, but the documentation actually makes
it very clear that we can reuse a pstate for several parses:
After yypush_parse returns a status other than YYPUSH_MORE, the
parser instance yyps may be reused for a new parse.
so we need to restore the parser to its pristine state so that (i) it
is ready to run the next parse, (ii) it properly supports
yyexpected_tokens for the next run.
* data/skeletons/yacc.c (b4_initialize_parser_state_variables): New,
extracted from the top of yyparse/yypush_parse.
(yypstate_clear): New.
(yypstate_new): Use it when push parsers are enabled.
Define after the yyps macros so that we can use the same code as the
regular pull parsers.
(yyparse): Use it when push parsers are _not_ enabled.
* examples/c/bistromathic/bistromathic.test: Check the completion on
the beginning of the line.
Currently completion on "at" proposes only "atan", but does not
actually complete "at" into "atan".
* examples/c/bistromathic/parse.y (completion): Install the lcp in
matches[0].
* examples/c/bistromathic/bistromathic.test: Check that case.
Currently "(1+<TAB>" does not work as expected, because "+" is not a
word breaking character.
* examples/c/bistromathic/parse.y (init_readline): Specify our word
breaking characters.
* examples/c/bistromathic/bistromathic.test: Avoid trailing spaces.
Let's use GNU readline and its TAB autocompletion to demonstrate the
use of yyexpected_tokens.
This shows a number of weaknesses in our current approach:
- some macros (yyssp, etc.) from push parsers "leak" in user code, we
need to undefine them
- the context needed by yyexpected_tokens does not need the token,
yypstate actually suffices
- yypstate is not properly setup when first allocated, which results
in a crash of yyexpected_tokens if fired before a first token was
read. We should move initialization from yypush_parse into
yypstate_new.
* examples/c/bistromathic/parse.y (yylex): Take input as a string, not
a file.
(EXIT): New token.
(input): Adjust to work only on a line.
(line): Remove.
(symbol_count, process_line, expected_tokens, completion)
(init_readline): New.
* examples/c/bistromathic/bistromathic.test: Adjust expectations.
This example will soon use GNU readline, so its scanner should be easy
to use (concurrently) on strings, not streams. This is not a place
where Flex shines, and anyway, these are examples of Bison, not Flex.
There's already lexcalc and reccalc that demonstrate the use of Flex.
* examples/c/bistromathic/scan.l: Remove.
* examples/c/bistromathic/parse.y (yylex): New.
Adjust dependencies.
The bistromathic example should not use Flex, it makes it too complex.
But it was the only example to show location tracking with Flex.
* examples/c/lexcalc/lexcalc.test, examples/c/lexcalc/parse.y,
* examples/c/lexcalc/scan.l: Demonstrate location tracking as is done
in bistromathic.
* data/skeletons/lalr1.java (Context): Make data members private.
(Context.getLocation): New.
* examples/java/calc/Calc.y, tests/java.at, tests/local.at: Adjust.
* data/skeletons/lalr1.java (yyexpectedTokens)
(yysyntaxErrorArguments): Make them methods of Context.
(Context.yysymbolName): New.
* tests/local.at: Adjust.
In Java there is no need for N_ and yytranslate_. So instead of
hard-coding the use of N_ in the table of the symbol names, rely on
b4_symbol_translate.
* src/output.c (prepare_symbol_names): Use b4_symbol_translate instead
of N_.
* data/skeletons/c.m4 (b4_symbol_translate): New.
* data/skeletons/lalr1.java (yysymbolName): New.
Use it.
* examples/java/calc/Calc.y: Use parse.error=detailed.
* tests/calc.at: Check parse.error=detailed.
* examples/java/calc/Calc.y: The StreamTokenizer cannot "peek" for the
next character, it reads it, and keeps it for the next call. So the
current location is one passed the end of the current token. To avoid
this, keep the previous position, and use it to end the current token.
* examples/java/calc/Calc.test: Adjust.
Currently, if we have long rules and series of shift, we stack states
without showing stack. Let's be more incremental, and do how the Java
skeleton does.
* data/skeletons/lalr1.cc, data/skeletons/lalr1.d,
* data/skeletons/yacc.c:
Here.
Adjust test cases.
* tests/torture.at (AT_DATA_STACK_TORTURE): Disable stack traces: this
test produces a very large stack, and showing the stack each time we
shift a token goes quadatric.
This example, so far, was tracking the current token number, not the
current column number. This is not nice for an example...
* examples/java/Calc.y (PositionReader): New.
Use it.
* examples/java/Calc.test: Check the output.
AFAICT, autoboxing/unboxing was added in Java 5 (September 30, 2004).
I think we can afford to use it. It should help us merge some Java
tests with the main ones.
However, beware that != does not unbox: it compares the object
addresses.
* examples/java/Calc.y, tests/java.at: Simplify.
* examples/java/Calc.test, tests/java.at: Improve tests.
Add an example to demonstrate the use of push parser. I'm pleasantly
surprised: parse.error=detailed works like a charm with push parsers.
* examples/c/local.mk, examples/c/pushcalc/Makefile
* examples/c/pushcalc/README.md, examples/c/pushcalc/calc.test,
* examples/c/pushcalc/calc.y, examples/c/pushcalc/local.mk:
New.
* examples/c/calc/calc.y: Restore to its original state, with
parse.error=detailed instead of parse.error=custom (this example
should be simple).
* examples/c/calc/calc.test: Check syntax errors.
* examples/c/lexcalc/parse.y: Add comments.
Provide users with a means to query for the currently allowed tokens.
Could be used for autocompletion for instance.
* data/skeletons/yacc.c (yyexpected_tokens): New, extracted from
yysyntax_error_arguments.
* examples/c/calc/calc.y (PRINT_EXPECTED_TOKENS): New.
Use it.
When parse.error is custom, let users define a yyreport_syntax_error
function, and use it.
* data/skeletons/bison.m4 (b4_error_verbose_if): Accept 'custom'.
* data/skeletons/yacc.c: Implement it.
* examples/c/calc/calc.y: Experiment with it.
Reported by Thomas Petazzoni.
https://lists.gnu.org/archive/html/bug-bison/2019-08/msg00000.html
* examples/c/reccalc/local.mk: Complete dependencies, including for
earlier versions of Automake (for sake of our CI, on top of Ubuntu
Xenial/Bionic, which feature only Automake 1.15).
(%D%/scan.c %D%/scan.h): Upgrade to the full version provided in
Automake's documentation.
This patch contains more fixes to prefer signed to unsigned
integer types, as modern tools like 'gcc -fsanitize=undefined'
can check for signed integer overflow but not unsigned overflow.
* NEWS: Document the API change.
* boostrap.conf (gnulib_modules): Add intprops.
* data/skeletons/glr.c: Include stddef.h and stdint.h,
since this skeleton can assume C99 or later.
(YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX.
(yybool) [!__cplusplus]: Now signed (which is how bool behaves).
(YYTRANSLATE): Avoid use of unsigned, and make the macro
safe even for values greater than UINT_MAX.
(yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack)
(yyaddDeferredAction, yyinitStateSet, yyinitGLRStack)
(yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes)
(yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction)
(yyglrReduce, yysplitStack, yyreportTree, yycompressStack)
(yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError)
(yyparse, yy_yypstack, yypstack, yypdumpstack):
* tests/input.at (Torturing the Scanner):
Prefer ptrdiff_t to size_t.
* data/skeletons/c++.m4 (b4_yytranslate_define):
* src/AnnotationList.c (AnnotationList__computePredecessorAnnotations):
* src/AnnotationList.h (AnnotationIndex):
* src/InadequacyList.h (InadequacyListNodeCount):
* src/closure.c (closure_new):
* src/complain.c (error_message, complains, complain_indent)
(complain_args, duplicate_directive, duplicate_rule_directive):
* src/gram.c (nritems, ritem_print, grammar_dump):
* src/ielr.c (ielr_compute_ritem_sees_lookahead_set)
(ielr_item_has_lookahead, ielr_compute_annotation_lists)
(ielr_compute_lookaheads):
* src/location.c (columns, boundary_print, location_print):
* src/muscle-tab.c (muscle_percent_define_insert)
(muscle_percent_define_check_values):
* src/output.c (prepare_rules, prepare_actions):
* src/parse-gram.y (id, handle_require):
* src/reader.c (record_merge_function_type, packgram):
* src/reduce.c (nuseless_productions, nuseless_nonterminals)
(inaccessable_symbols):
* src/relation.c (relation_print):
* src/scan-code.l (variant, variant_table_size, variant_count)
(variant_add, get_at_spec, show_sub_message, show_sub_messages)
(parse_ref):
* src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>)
(scan_integer, convert_ucn_to_byte, handle_syncline):
* src/scan-skel.l (at_complain):
* src/symtab.c (complain_symbol_redeclared)
(complain_semantic_type_redeclared, complain_class_redeclared)
(symbol_class_set, complain_user_token_number_redeclared):
* src/tables.c (conflict_tos, conflrow, conflict_table)
(conflict_list, save_row, pack_vector):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Prefer signed to unsigned integer.
* data/skeletons/lalr1.cc (yy_lac_check_):
* tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR):
* tests/local.at (AT_YYLEX_DEFINE(c)):
Omit now-unnecessary casts.
* data/skeletons/location.cc (b4_location_define):
* doc/bison.texi (Mfcalc Lexer, C++ position, C++ location):
Prefer int to unsigned for line and column numbers.
Change example to abort explicitly on memory exhaustion,
and fix an off-by-one bug that led to undefined behavior.
* data/skeletons/stack.hh (stack::operator[]):
Also allow ptrdiff_t indexes.
(stack::pop, slice::slice, slice::operator[]):
Index arg is now ptrdiff_t, not int.
(stack::ssize): New method.
(slice::range_): Now ptrdiff_t, not int.
* data/skeletons/yacc.c (b4_state_num_type): Remove.
All uses replaced by b4_int_type.
(YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros.
(yylac, yyparse): Use them around conversions that -Wconversion
would give false alarms about. Omit unnecessary casts.
(yy_stack_print): Use int rather than unsigned, and omit
a cast that doesn’t seem to be needed here any more.
* examples/c++/variant.yy (yylex):
* examples/c++/variant-11.yy (yylex):
Omit no-longer-needed conversions to unsigned.
* src/InadequacyList.c (InadequacyList__new_conflict):
Don’t assume *node_count is unsigned.
* src/output.c (muscle_insert_unsigned_table):
Remove; no longer used.
* data/skeletons/lalr1.java: Use more conventional function names for
Java.
Prefer < and <= to => and >.
Use the same approach for m4 quotation as in the other skeletons.
Fix indentation issues.
* tests/calc.at, tests/java.at, tests/javapush.at: Fix quotation style.
(main): Use 'args', not 'argv', the former seems more conventional and
is used elsewhere in Bison.
Prefer character literals to integers to denote characters.
* examples/java/Calc.y: Likewise.
It works only in implicit rules (or with GNU Make, but not with
Solaris Make).
Reported by Bruno Haible.
http://lists.gnu.org/archive/html/bug-bison/2019-05/msg00009.html
Diagnosed thanks to Kiyoshi Kanazawa.
* examples/c/reccalc/local.mk, examples/d/local.mk,
* examples/java/local.mk: Don't use $< in non implicit rules.
Because some of our examples use
%C%_reccalc_SOURCES = %D%/parse.y
Automake ships parse.y and parse.c, and possibly parse.h when it
"understands" that there is one. This is not what we want: ship only
parser.y. Yet we still want to use Automake to compile the sources
from parser.y. The easiest seems to use
nodist_%C%_reccalc_SOURCES = %D%/parse.y
together with
dist_reccalc_DATA = %D%/parse.y %D%/scan.l %D%/Makefile %D%/README.md
which guarantees that parse.y is indeed shipped.
* examples/c/calc/local.mk, examples/c/lexcalc/local.mk,
* examples/c/reccalc/local.mk: Always use nodist_*SOURCES for parsers,
let the dist_*_DATA rules do their job.