bison

mirror of https://git.savannah.gnu.org/git/bison.git synced 2026-03-09 20:33:03 +00:00

Author	SHA1	Message	Date
Akim Demaille	3911aba39a	%merge: associate it to its first definition, not the latest Currently each time we meet %merge we record this location as the defining location (and symbol). Instead, record the first definition. In the generated code we go from yy0->A = merge (yy0, yy1); to yy0->S = merge (yy0, yy1); where S was indeed the first symbol, and in the diagnostics we go from glr-regr18.y:30.18-24: error: result type clash on merge function 'merge': <type2> != <type1> 30 \| sym2: sym3 %merge<merge> { $$ = $1; } ; \| ^~~~~~~ glr-regr18.y:29.18-24: note: previous declaration 29 \| sym1: sym2 %merge<merge> { $$ = $1; } ; \| ^~~~~~~ glr-regr18.y:31.13-19: error: result type clash on merge function 'merge': <type3> != <type2> 31 \| sym3: %merge<merge> { $$ = 0; } ; \| ^~~~~~~ glr-regr18.y:30.18-24: note: previous declaration 30 \| sym2: sym3 %merge<merge> { $$ = $1; } ; \| ^~~~~~~ to glr-regr18.y:30.18-24: error: result type clash on merge function 'merge': <type2> != <type1> 30 \| sym2: sym3 %merge<merge> { $$ = $1; } ; \| ^~~~~~~ glr-regr18.y:29.18-24: note: previous declaration 29 \| sym1: sym2 %merge<merge> { $$ = $1; } ; \| ^~~~~~~ glr-regr18.y:31.13-19: error: result type clash on merge function 'merge': <type3> != <type1> 31 \| sym3: %merge<merge> { $$ = 0; } ; \| ^~~~~~~ glr-regr18.y:29.18-24: note: previous declaration 29 \| sym1: sym2 %merge<merge> { $$ = $1; } ; \| ^~~~~~~ where both duplicates are reported against definition 1, rather than using definition 1 as a reference when diagnosing about definition 2, and then 2 as a reference for 3. * src/reader.c (record_merge_function_type): Keep the first definition. * tests/glr-regression.at: Adjust.	2020-12-31 08:07:34 +01:00
Akim Demaille	ac3d5b76f7	%merge: let mergers record a typing-symbol, rather than a type Symbols are richer than types, and in M4 it is much simpler (and more common) to deal with symbols rather than types. So let's associate mergers to a symbol rather than a type name. * src/reader.h (merger_list): Replace the 'type' member by a symbol member. * src/reader.c (record_merge_function_type): Take a symbol as argument, rather than a type name. * src/output.c (merger_output): Adjust.	2020-12-31 08:07:11 +01:00
Akim Demaille	c0f3b55b25	style: address syntax-check diagnostics * examples/c/glr/c++-types.y: Formatting changes. * po/POTFILES.in: Add missing files. * src/reader.c: Remove useless include. * tests/calc.at: Avoid magic values for exit. Obfuscate calls to error.	2020-12-21 07:51:02 +01:00
Akim Demaille	0e78a9028e	portability: beware of GCC 4.6 src/reader.c: In function 'grammar_start_symbols_add': src/reader.c:67:24: error: declaration of 'dup' shadows a global declaration [-Werror=shadow] * src/reader.c (grammar_start_symbols_add): Rename dup as dupl.	2020-12-03 19:46:20 +01:00
Akim Demaille	5b19f91ccf	multistart: check duplicates * src/symlist.h, src/symlist.c (symbol_list_find_symbol) (symbol_list_last): New. (symbol_list_append): Use symbol_list_last. * src/reader.c (grammar_start_symbols_add): Check and discard duplicates. * tests/input.at (Duplicate %start symbol): New. * tests/reduce.at (Bad start symbols): Add the multistart keyword.	2020-11-30 16:48:03 +01:00
Akim Demaille	d798851e48	style: rename grammar_start_symbols_set as grammar_start_symbols_add * src/reader.h, src/reader.c (grammar_start_symbols_set): Rename as... (grammar_start_symbols_add): this. Adjust dependencies.	2020-11-22 11:18:20 +01:00
Akim Demaille	683040b324	multistart: allow tokens as start symbols After all, why not? * src/reader.c (switching_token): Use symbol_id_get. (check_start_symbols): Require that the start symbol is a token only if it's the only one. * examples/c/lexcalc/parse.y: Let NUM be a start symbol.	2020-09-27 09:44:23 +02:00
Akim Demaille	d9cf99b6a5	multistart: use b4_accept instead of action post-processing For each start symbol, generate a parsing function with a richer return value than the usual of yyparse. Reserve a place for the returned semantic value, in order to avoid having to pass a pointer as argument to "return" that value. This also makes the call to the parsing function independent of whether a given start-symbol is typed. For instance, if the grammar file contains: %type <int> expression %start input expression (so "input" is valueless) we get typedef struct { int yystatus; } yyparse_input_t; yyparse_input_t yyparse_input (void); typedef struct { int yyvalue; int yystatus; } yyparse_expression_t; yyparse_expression_t yyparse_expression (void); This commit also changes the implementation of the parser termination: when there are multiple start symbols, it is the initial rules that explicitly YYACCEPT. They do that after having exported the start-symbol's value (if it is typed): switch (yyn) { case 1: /* $accept: YY_EXPRESSION expression $end */ { ((yyvalue).TOK_expression) = (yyvsp[-1].TOK_expression); YYACCEPT; } break; case 2: /* $accept: YY_INPUT input $end */ { YYACCEPT; } break; I have tried several ways to deal with termination, and this is the one that appears the best one to me. It is also the most natural. * src/scan-code.h, src/scan-code.l (obstack_for_actions): New. * src/reader.c (grammar_rule_check_and_complete): Generate the actions of the rules for each start symbol. * data/skeletons/bison.m4 (b4_symbol_slot): New, with safer semantics than type and type_tag. * data/skeletons/yacc.c (b4_accept): New. Generates the body of the action of the start rules. (_b4_declare_sub_yyparse): For each start symbol define a dedicated return type for its parsing function. Adjust the declaration of its parsing function. (_b4_define_sub_yyparse): Adjust the definition of the function. * examples/c/lexcalc/parse.y: Check the case of valueless symbols. 
* examples/c/lexcalc/lexcalc.test: Check start symbols.	2020-09-27 09:44:18 +02:00
Akim Demaille	a6805bb8d9	multistart: adjust reader checks for generated rules So far we were not checking the generated rule 0 at all. Now there can be several of them. Instead of not checking at all, let's be more selective on the check to run on them. * src/reader.c (grammar_rule_check_and_complete): Don't check for value usage for generated rules, it is ok to have a valued start symbol, in which case it is ok for the generated rule ("accept: start $end {}") to not use $1. (packgram): Call grammar_rule_check_and_complete for all the rules.	2020-09-27 09:23:51 +02:00
Akim Demaille	8eaddf326b	multistart: turn start symbols into rules on $accept Now that the parser can read several start symbols, let's process them, and create the corresponding rules. * src/parse-gram.y (grammar_declaration): Accept a list of start symbols. * src/reader.h, src/reader.c (grammar_start_symbol_set): Rename as... (grammar_start_symbols_set): this. * src/reader.h, src/reader.c (start_flag): Replace with... (start_symbols): this. * src/reader.c (grammar_start_symbols_set): Build a list of start symbols. (switching_token, create_start_rules): New. (check_and_convert_grammar): Use them to turn the list of start symbols into a set of rules. * src/reduce.c (nonterminals_reduce): Don't complain about $accept, it's an internal detail. (reduce_grammar): Complain about all the start symbols that don't derive sentences. * src/symtab.c (startsymbol, startsymbol_loc): Remove, replaced by start_symbols. symbols_pack): Move the check about the start symbols to... * src/symlist.c (check_start_symbols): here. Adjust to multiple start symbols. * tests/reduce.at (Empty Language): Generalize into... (Bad start symbols): this.	2020-09-27 09:23:51 +02:00
Akim Demaille	e50ec28153	reader: get ready to create several initial rules * src/reader.c (create_start_rule): New. Use it.	2020-09-27 09:23:50 +02:00
Akim Demaille	0711dca9d9	add support for --html * bootstrap.conf: We need the "execute" module. * src/files.h, src/files.c (spec_html_file, html_flag): New. * src/getargs.h, src/getargs.c (--html): New. * src/print-xml.h, src/print-xml.c (print_html): New. * src/main.c: Use them. * tests/output.at, tests/report.at: Check --html.	2020-09-19 17:49:03 +02:00
Valentin Tolmer	ef09bf065a	glr2.cc: fork glr.cc to a c++ version This is a fork of glr.cc to be c++-first instead of a wrapper around glr.c. * data/skeletons/glr2.cc: New. * data/skeletons/bison.m4, data/skeletons/c++.m4: Adjust. * data/skeletons/c.m4 (b4_user_args_no_comma): New. * src/reader.c (grammar_rule_check_and_complete): glr2.cc is C++. * tests/actions.at, tests/c++.at, tests/calc.at, tests/conflicts.at, * tests/input.at, tests/local.at, tests/regression.at, tests/scanner.at, * tests/synclines.at, tests/types.at: Also check glr2.cc.	2020-08-30 10:45:21 +02:00
Akim Demaille	b7aab2dbad	fix: crash when redefining the EOF token Reported by Agency for Defense Development. https://lists.gnu.org/r/bug-bison/2020-08/msg00008.html On an empty such as %token FOO BAR FOO 0 %% input: %empty we crash because when we find FOO 0, we decrement ntokens (since FOO was discovered to be EOF, which is already known to be a token, so we increment ntokens for it, and need to cancel this). This "works well" when EOF is properly defined in one go, but here it is first defined and later only assign token code 0. In the meanwhile BAR was given the token number that we just decremented. To fix this, assign symbol numbers after parsing, not during parsing, so that we also saw all the explicit token codes. To maintain the current numbers (I'd like to keep no difference in the output, not just equivalence), we need to make sure the symbols are numbered in the same order: that of appearance in the source file. So we need the locations to be correct, which was almost the case, except for nterms that appeared several times as LHS (i.e., several times as "foo: ..."). Fixing the use of location_of_lhs sufficed (it appears it was intended for this use, but its implementation was unfinished: it was always set to "false" only). * src/symtab.c (symbol_location_as_lhs_set): Update location_of_lhs. (symbol_code_set): Remove broken hack that decremented ntokens. (symbol_class_set, dummy_symbol_get): Don't set number, ntokens and nnterms. (symbol_check_defined): Do it. (symbols): Don't count nsyms here. Actually, don't count nsyms at all: let it be done in... * src/reader.c (check_and_convert_grammar): here. Define nsyms from ntokens and nnterms after parsing. * tests/input.at (EOF redeclared): New. * examples/c/bistromathic/bistromathic.test: Adjust the traces: in "%nterm <double> exp %% input: ...", exp used to be numbered before input.	2020-08-07 07:30:06 +02:00
Akim Demaille	89e42ffb4b	style: fix missing space before paren * cfg.mk (_space_before_paren_exempt): Be less laxist. * src/output.c, src/reader.c: Fix space before paren issues. Pacify the warnings where applicable.	2020-08-07 07:30:06 +02:00
Maarten De Braekeleer	ad6f600bb1	portability: rename accept to acceptsymbol because of MSVC MSVC already defines this symbol. * src/symtab.h, src/symtab.c (accept): Rename as... (acceptsymbol): this. Adjust dependencies.	2020-08-02 08:32:57 +02:00
Akim Demaille	0820f16ca8	style: update comments * src/reader.c: action_obstack was removed in 2002... * src/parse-gram.y: Better names. * src/scan-code.h: More comments.	2020-07-05 09:59:45 +02:00
Akim Demaille	0e5cbd38b2	style: shift/reduce, not shift-reduce * src/reader.c: here.	2020-06-28 08:33:24 +02:00
Akim Demaille	feb0bb0a59	style: rename endtoken as eoftoken * src/symtab.h, src/symtab.c (endtoken): Rename as... (eoftoken): this. Adjust dependencies.	2020-06-27 17:31:59 +02:00
Akim Demaille	0895858d8e	style: use 'nonterminal' consistently * doc/bison.texi: Formatting changes. * src/gram.h, src/gram.c (nvars): Rename as... (nnterms): this. Adjust dependencies. (section): New. Use it. Replace "non terminal" and "non-terminal" by "nonterminal".	2020-06-27 11:39:32 +02:00
Akim Demaille	5855da4722	parser: keep string aliases as the user wrote it Currently our scanner decodes all the escapes in the strings, and we later reescape the strings when we emit them. This is troublesome, as we do not respect the user input. For instance, when the user writes in UTF-8, we destroy her string when we write it back. And this shows everywhere: in the reports we show the escaped string instead of the actual alias: 0 $accept: . exp $end 1 exp: . exp "\342\212\225" exp 2 \| . exp "+" exp 3 \| . exp "+" exp 4 \| . "number" 5 \| . "\303\221\303\271\341\271\203\303\251\342\204\235\303\264" "number" shift, and go to state 1 "\303\221\303\271\341\271\203\303\251\342\204\235\303\264" shift, and go to state 2 This commit preserves the user's exact spelling of the string aliases, instead of interpreting the escapes and then reescaping. The report now shows: 0 $accept: . exp $end 1 exp: . exp "⊕" exp 2 \| . exp "+" exp 3 \| . exp "+" exp 4 \| . "number" 5 \| . "Ñùṃéℝô" "number" shift, and go to state 1 "Ñùṃéℝô" shift, and go to state 2 Likewise, the XML (and therefore HTML) outputs are fixed. * src/scan-gram.l (STRING, TSTRING): Do not interpret the escapes in the resulting string. * src/parse-gram.y (unquote, parser_init, parser_free, unquote_free) (handle_defines, handle_language, obstack_for_unquote): New. Use them to unquote where needed. * tests/regression.at, tests/report.at: Update.	2020-06-13 16:56:40 +02:00
Akim Demaille	e7aff57122	style: rename user_token_number as code This should have been done in 3.6, but I wanted to avoid introducing conflicts into Vincent's work on counterexamples. It turns out it's completely orthogonal. * data/README.md, data/skeletons/bison.m4, data/skeletons/c++.m4, * data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/java.m4, * data/skeletons/lalr1.d, data/skeletons/lalr1.java, * data/skeletons/variant.hh, data/skeletons/yacc.c, src/conflicts.c, * src/derives.c, src/gram.c, src/gram.h, src/output.c, * src/parse-gram.c, src/parse-gram.y, src/print-xml.c, src/print.c, * src/reader.c, src/symtab.c, src/symtab.h, tests/input.at, * tests/types.at: s/user_token_number/code/g. Plus minor changes.	2020-05-23 08:43:58 +02:00
Akim Demaille	e50de09886	tokens: properly define the YYEOF token kind Currently EOF is handled in an adhoc way, with a #define YYEOF 0 in the implementation file. As a result, the user has to define her own EOF token if she wants to use it, which is a pity. Give the $end token a visible kind name, YYEOF. Except that in C, where enums are not scoped, we would have collisions between all the definitions of YYEOFs in the header files, so in C, make it <api.PREFIX>EOF. * data/skeletons/c.m4 (YYEOF): Override its name to avoid collisions. Unless the user already gave it a different name. * data/skeletons/glr.c (YYEOF): Remove. Use ]b4_symbol(0, [id])[ instead. Add support for "pre_epilogue", for glr.cc. * data/skeletons/glr.cc: Remove dead code (never emitted #undefs). * data/skeletons/yacc.c * src/parse-gram.c * src/reader.c * src/symtab.c * tests/actions.at * tests/input.at	2020-04-12 13:56:44 +02:00
Akim Demaille	cc68bbf799	bison: use consistently "token kind", not "token type" * src/output.c, src/reader.c, src/scan-gram.l, src/tables.c: here.	2020-04-05 19:14:39 +02:00
Akim Demaille	296660304c	style: comment changes * src/symtab.h, src/lr0.c: here.	2020-02-23 08:25:53 +01:00
Victor Morales Cayuela	e09a72eeb0	diagnostics: modernize the display of submessages Since Bison 2.7, output was indented four spaces for explanatory statements. For example: input.y:2.7-13: error: %type redeclaration for exp input.y:1.7-11: previous declaration Since the introduction of caret-diagnostics, it became less clear. Remove the indentation and display submessages as in GCC: input.y:2.7-13: error: %type redeclaration for exp 2 \| %type <float> exp \| ^~~~~~~ input.y:1.7-11: note: previous declaration 1 \| %type <int> exp \| ^~~~~ * src/complain.h (SUB_INDENT): Remove. (warnings): Add "note" to the enum. * src/complain.h, src/complain.c (complain_indent): Replace by... (subcomplain): this. Adjust all dependencies. * tests/actions.at, tests/diagnostics.at, tests/glr-regression.at, * tests/input.at, tests/named-refs.at, tests/regression.at: Adjust expectations.	2020-02-15 08:28:40 +01:00
Akim Demaille	8036635251	package: bump copyrights to 2020 Run 'make update-copyright'.	2020-01-05 10:26:35 +01:00
Akim Demaille	28d1ca8f48	diagnostics: yacc reserves %type to nonterminals On %token TOKEN1 %type <ival> TOKEN1 TOKEN2 't' %token TOKEN2 %% expr: bison -Wyacc gives input.y:2.15-20: warning: POSIX yacc reserves %type to nonterminals [-Wyacc] 2 \| %type <ival> TOKEN1 TOKEN2 't' \| ^~~~~~ input.y:2.29-31: warning: POSIX yacc reserves %type to nonterminals [-Wyacc] 2 \| %type <ival> TOKEN1 TOKEN2 't' \| ^~~ input.y:2.22-27: warning: POSIX yacc reserves %type to nonterminals [-Wyacc] 2 \| %type <ival> TOKEN1 TOKEN2 't' \| ^~~~~~ The messages appear to be out of order, but they are emitted when the error is found. * src/symtab.h (symbol_class): Add pct_type_sym, used to denote symbols appearing in %type. * src/symtab.c (complain_pct_type_on_token): New. (symbol_class_set): Check that %type is not applied to tokens. (symbol_check_defined): pct_type_sym also means undefined. * src/parse-gram.y (symbol_decl.1): Set the class to pct_type_sym. * src/reader.c (grammar_current_rule_begin): pct_type_sym also means undefined. * tests/input.at (Yacc's %type): New.	2019-11-17 09:45:25 +01:00
Akim Demaille	8228d96d33	reader: reduce the "scope" of global variables We have too many global variables, adding structure would help. For a start, let's hide some of the variables closer to their usage. * src/getargs.c, src/files.h (current_file): Move to... * src/scan-gram.c: here. * src/scan-gram.h (gram_in, gram__flex_debug): Remove, make them private to the scanner. * src/reader.h, src/reader.c (reader): Take a grammar file as argument. Move the handling of scanner variables to... * src/scan-gram.l (gram_scanner_open, gram_scanner_close): here. (gram_scanner_initialize): Remove, replaced by gram_scanner_open. * src/main.c: Adjust.	2019-10-26 10:39:01 +02:00
Akim Demaille	6e7d8ba6a7	reader: let symtab deal with the symbols * src/reader.c (reader): Move the setting up of the builtin symbols to... * src/symtab.c (symbols_new): here.	2019-10-25 07:48:07 +02:00
Yuichiro Kaneko	3945beb1d2	style: update comment in reader.c rrhs and rlhs were removed by `b2ed6e5826`. * src/reader.c (packgram): Update comment.	2019-10-23 08:32:06 +02:00
Akim Demaille	9e6c5328d3	diagnostics: also show suggested %empty * src/reader.c (grammar_rule_check_and_complete): Suggest to add %empty. * tests/actions.at, tests/diagnostics.at: Adjust expectations.	2019-10-06 12:15:12 +02:00
Paul Eggert	133edcd248	Prefer signed to unsigned integers This patch contains more fixes to prefer signed to unsigned integer types, as modern tools like 'gcc -fsanitize=undefined' can check for signed integer overflow but not unsigned overflow. * NEWS: Document the API change. * boostrap.conf (gnulib_modules): Add intprops. * data/skeletons/glr.c: Include stddef.h and stdint.h, since this skeleton can assume C99 or later. (YYSIZEMAX): Now signed, and the minimum of SIZE_MAX and PTRDIFF_MAX. (yybool) [!__cplusplus]: Now signed (which is how bool behaves). (YYTRANSLATE): Avoid use of unsigned, and make the macro safe even for values greater than UINT_MAX. (yytnamerr, struct yyGLRState, struct yyGLRStateSet, struct yyGLRStack) (yyaddDeferredAction, yyinitStateSet, yyinitGLRStack) (yyexpandGLRStack, yymarkStackDeleted, yyremoveDeletes) (yyglrShift, yyglrShiftDefer, yy_reduce_print, yydoAction) (yyglrReduce, yysplitStack, yyreportTree, yycompressStack) (yyprocessOneStack, yyreportSyntaxError, yyrecoverSyntaxError) (yyparse, yy_yypstack, yypstack, yypdumpstack): * tests/input.at (Torturing the Scanner): Prefer ptrdiff_t to size_t. 
* data/skeletons/c++.m4 (b4_yytranslate_define): * src/AnnotationList.c (AnnotationList__computePredecessorAnnotations): * src/AnnotationList.h (AnnotationIndex): * src/InadequacyList.h (InadequacyListNodeCount): * src/closure.c (closure_new): * src/complain.c (error_message, complains, complain_indent) (complain_args, duplicate_directive, duplicate_rule_directive): * src/gram.c (nritems, ritem_print, grammar_dump): * src/ielr.c (ielr_compute_ritem_sees_lookahead_set) (ielr_item_has_lookahead, ielr_compute_annotation_lists) (ielr_compute_lookaheads): * src/location.c (columns, boundary_print, location_print): * src/muscle-tab.c (muscle_percent_define_insert) (muscle_percent_define_check_values): * src/output.c (prepare_rules, prepare_actions): * src/parse-gram.y (id, handle_require): * src/reader.c (record_merge_function_type, packgram): * src/reduce.c (nuseless_productions, nuseless_nonterminals) (inaccessable_symbols): * src/relation.c (relation_print): * src/scan-code.l (variant, variant_table_size, variant_count) (variant_add, get_at_spec, show_sub_message, show_sub_messages) (parse_ref): * src/scan-gram.l (<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>) (scan_integer, convert_ucn_to_byte, handle_syncline): * src/scan-skel.l (at_complain): * src/symtab.c (complain_symbol_redeclared) (complain_semantic_type_redeclared, complain_class_redeclared) (symbol_class_set, complain_user_token_number_redeclared): * src/tables.c (conflict_tos, conflrow, conflict_table) (conflict_list, save_row, pack_vector): * tests/local.at (AT_YYLEX_DEFINE(c)): Prefer signed to unsigned integer. * data/skeletons/lalr1.cc (yy_lac_check_): * tests/actions.at (_AT_CHECK_PRINTER_AND_DESTRUCTOR): * tests/local.at (AT_YYLEX_DEFINE(c)): Omit now-unnecessary casts. * data/skeletons/location.cc (b4_location_define): * doc/bison.texi (Mfcalc Lexer, C++ position, C++ location): Prefer int to unsigned for line and column numbers. 
Change example to abort explicitly on memory exhaustion, and fix an off-by-one bug that led to undefined behavior. * data/skeletons/stack.hh (stack::operator[]): Also allow ptrdiff_t indexes. (stack::pop, slice::slice, slice::operator[]): Index arg is now ptrdiff_t, not int. (stack::ssize): New method. (slice::range_): Now ptrdiff_t, not int. * data/skeletons/yacc.c (b4_state_num_type): Remove. All uses replaced by b4_int_type. (YY_CONVERT_INT_BEGIN, YY_CONVERT_INT_END): New macros. (yylac, yyparse): Use them around conversions that -Wconversion would give false alarms about. Omit unnecessary casts. (yy_stack_print): Use int rather than unsigned, and omit a cast that doesn’t seem to be needed here any more. * examples/c++/variant.yy (yylex): * examples/c++/variant-11.yy (yylex): Omit no-longer-needed conversions to unsigned. * src/InadequacyList.c (InadequacyList__new_conflict): Don’t assume node_count is unsigned. src/output.c (muscle_insert_unsigned_table): Remove; no longer used.	2019-10-02 17:11:33 -07:00
Akim Demaille	8c06cb9130	fixits: be sure to preserve the action when adding %empty Currently we remove the rhs to install %empty instead. * src/reader.c (grammar_rule_check_and_complete): Insert the missing %empty in front of the rhs, not in replacement thereof. * tests/actions.at (Add missing %empty): Check that.	2019-05-03 16:28:28 +02:00
Akim Demaille	013720f0e7	style: use consistently _loc for locations Some members are called foo_location, others are foo_loc. Stick to the latter. * src/gram.h, src/location.h, src/location.c, src/output.c, * src/parse-gram.y, src/reader.h, src/reader.c, src/reduce.c, * src/scan-gram.l, src/symlist.h, src/symlist.c, src/symtab.h, * src/symtab.c: Use _loc consistently, not _location.	2019-05-03 16:28:28 +02:00
Akim Demaille	365b4d95a4	style: clarify the use of symbol_lists' locations symbol_list features a 'location' and a 'sym_loc' member. The former is expected to be set only for symbol_lists that denote a symbol (not a type name), and the latter should only denote the location of the symbol/type name. Yet both are set, and the name "location" is too imprecise. * src/symlist.h, src/symlist.c (symbol_list::location): Rename as rhs_loc for clarity. Move it to the "section" of data valid only for rules. * src/reader.c, src/scan-code.l: Adjust.	2019-05-03 16:28:28 +02:00
Akim Demaille	57290d63fd	package: various fixes for syntax-check * cfg.mk: Disable checks where needed (e.g., we do want to check the behavior with tabs). (sc_at_parser_check): Remove. Unfortunately since `a11c144609` we no longer use the './' prefix to run programs in the current directory. That was so that we could run Java programs like the other, although they are not run with the `./` prefix (see `967a59d2c0`). As a consequence this sc check no longer makes sense. However, since now AT_PARSER_CHECK passes the `./` prefix itself, this sc-check was superfluous. * examples/c/reccalc/scan.l: Use memcpy, not strncpy. * src/ielr.c, src/reader.c: Obfuscate "lr(0)" so that the sc-check for "space before paren" does not fire. * tests/diagnostics.at: Avoid space-tab, use tab-tab.	2019-04-28 08:24:31 +02:00
Akim Demaille	971e72514f	updates: insert/remove %empty * src/reader.c (grammar_rule_check_and_complete): Generate fixits for adding/removing %empty. * tests/actions.at, tests/diagnostics.at, tests/existing.at: Adjust.	2019-04-24 13:21:24 +02:00
Akim Demaille	ae91c3cce3	reader: clarify variable names * src/reader.c (grammar_rule_check_and_complete): When 'p' and 'lhs' are aliases, prefer the latter, for clarity and consistency. (grammar_current_rule_begin): Avoid 'p', current_rule suffices. * src/gram.h, src/gram.c: Comment changes. ptdr# calc.tab.c	2019-03-24 18:40:46 +01:00
Akim Demaille	e346210c03	add LR(0) output This should not be used to generate parsers. My point is actually to facilitate debugging (when tweaking the generation of the LR(0) automaton for instance, not caring -yet- about lookaheads). * src/reader.c (prepare_percent_define_front_end_variables): Add lr(0). * src/conflicts.c (set_conflicts): Be robust to reds not having lookaheads at all. * src/ielr.c (LrType, lr_type_get): Adjust. (ielr): Implement support for LR(0). * src/lalr.c (lalr_free): Don't free LA when it's not computed.	2019-02-05 19:02:09 +01:00
Akim Demaille	9566232422	style: comment and name changes * src/output.c (prepare_symbol_names): here. * src/reader.c: Remove obsolete comment. * src/scan-code.l: Use \|\| for Boolean or.	2019-02-02 17:32:10 +01:00
Akim Demaille	dc654a925c	style: comment changes * src/reader.c, src/scan-code.l: here.	2019-02-02 17:32:04 +01:00
Akim Demaille	2c8fb4d126	style: rename duplicate_directive as duplicate_rule_directive * src/complain.h, src/complain.c: here. Adjust callers.	2019-01-16 07:59:25 +01:00
Akim Demaille	2471733f1a	package: bump copyrights to 2019	2019-01-05 14:58:05 +01:00
Akim Demaille	fdceb6330f	symbols: set tag_seen when assigning a type to symbols * src/reader.h, src/reader.c (tag_seen): Move to... * src/symtab.h, src/symtab.c: here. (symbol_type_set): Set it to true. * src/parse-gram.y: Don't.	2018-12-15 17:41:25 +01:00
Akim Demaille	2b2556b41c	style: reduce scopes * src/conflicts.c, src/reader.c: Minor style changes.	2018-11-21 22:08:47 +01:00
Paul Hilfinger	b34b12c4f9	allow %expect and %expect-rr modifiers on individual rules This change allows one to document (and check) which rules participate in shift/reduce and reduce/reduce conflicts. This is particularly important for GLR parsers, where conflicts are a normal occurrence. For example, %glr-parser %expect 1 %% ... argument_list: arguments %expect 1 \| arguments ',' \| %empty ; arguments: expression \| argument_list ',' expression ; ... Looking at the output from -v, one can see that the shift-reduce conflict here is due to the fact that the parser does not know whether to reduce arguments to argument_list until it sees the token AFTER the following ','. By marking the rule with %expect 1 (because there is a conflict in one state), we document the source of the 1 overall shift-reduce conflict. In GLR parsers, we can use %expect-rr in a rule for reduce/reduce conflicts. In this case, we mark each of the conflicting rules. For example, %glr-parser %expect-rr 1 %% stmt: target_list '=' expr ';' \| expr_list ';' ; target_list: target \| target ',' target_list ; target: ID %expect-rr 1 ; expr_list: expr \| expr ',' expr_list ; expr: ID %expect-rr 1 \| ... ; In a statement such as x, y = 3, 4; the parser must reduce x to a target or an expr, but does not know which until it sees the '='. So we notate the two possible reductions to indicate that each conflicts in one rule. See https://lists.gnu.org/archive/html/bison-patches/2013-02/msg00105.html. * doc/bison.texi (Suppressing Conflict Warnings): Document %expect, %expect-rr in grammar rules. * src/conflicts.c (count_state_rr_conflicts): Adjust comment. (rule_has_state_sr_conflicts): New static function. (count_rule_sr_conflicts): New static function. (rule_has_state_rr_conflicts): New static function. (count_rule_rr_conflicts): New static function. (rule_conflicts_print): New static function. (conflicts_print): Also use rule_conflicts_print to report on individual rules. 
* src/gram.h (struct rule): Add new fields expected_sr_conflicts, expected_rr_conflicts. * src/reader.c (grammar_midrule_action): Transfer expected_sr_conflicts, expected_rr_conflicts to new rule, and turn off in current_rule. (grammar_current_rule_expect_sr): New function. (grammar_current_rule_expect_rr): New function. (packgram): Transfer expected_sr_conflicts, expected_rr_conflicts to new rule. * src/reader.h (grammar_current_rule_expect_sr): New function. (grammar_current_rule_expect_rr): New function. * src/symlist.c (symbol_list_sym_new): Initialize expected_sr_conflicts, expected_rr_conflicts. * src/symlist.h (struct symbol_list): Add new fields expected_sr_conflicts, expected_rr_conflicts. * tests/conflicts.at: Add tests "%expect in grammar rule not enough", "%expect in grammar rule right.", "%expect in grammar rule too much."	2018-11-21 22:08:47 +01:00
Akim Demaille	03a13ce793	reader: recognize C++ even when it's not lalr1.cc or glr.cc * src/reader.c (grammar_rule_check_and_complete): If a user uses her own skeleton but sets the language to C++, recognize it as C++.	2018-10-17 17:53:51 +02:00
Akim Demaille	e3fdc37049	generate the default action only for C++ This commit adds restrictions to what was done in `01898726e2` [1]. Rici Lake [2] has shown that it's risky to disable the pre-action, at least now. Also, generating the default $$ = $1 action can have bad effects in some cases [3]. The original change [1] was prompted for C++. Let's try it there only, for a start. We could restrict it further to lalr1.cc with variants, but we need to see in the wild how this change behaves. And it is not unreasonable to expect grammar files in C++ to behave better wrt types. See [1] https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00050.html [2] https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00061.html [3] https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00066.html * src/getargs.c: Style changes. * src/reader.c (grammar_rule_check_and_complete): Complete only for C++.	2018-10-16 13:41:09 +02:00
Akim Demaille	01898726e2	generate the default semantic action Currently, in C, the default semantic action is implemented by being always run before running the actual user semantic action. As a consequence, when the user action is run, $$ is already set as $1. In C++ with variants, we don't do that, since we cannot manipulate the semantic value without knowing its exact type. When variants are enabled, the only guarantee is that $$ is default constructed and ready to be used. Some users still would like the default action to be run with variants. Frank Heckenbach's parser in C++17 (http://lists.gnu.org/archive/html/bug-bison/2018-04/msg00011.html) provides this feature, but relying on std::variant's dynamic typing, which we forbid in lalr1.cc. The simplest seems to be actually generating the default semantic action (in all languages/skeletons). This makes the pre-action (that sets $$ to $1) useless. But... maybe some users depend on this, in spite of the comments that clearly warn against this. So let's not turn this off just yet. * src/reader.c (grammar_rule_check): Rename as... (grammar_rule_check_and_complete): this. Install the default semantic action when applicable. * examples/variant-11.yy, examples/variant.yy, tests/calc.at: Exercise the default semantic action, even with variants.	2018-10-14 18:53:21 +02:00


431 Commits