bison

mirror of https://git.savannah.gnu.org/git/bison.git synced 2026-06-08 16:52:35 +00:00

Author	SHA1	Message	Date
Akim Demaille	00c80bc96c	yacc.c: use yysymbol_type_t instead of int for yytoken Now that we have a proper type for internal symbol numbers, let's use it. More code needs conversion, e.g., printers and destructors, but they are shared with glr.c, which is not ready yet for this change. It will also help us deal with warnings such as (GCC9 on GNU/Linux): input.c: In function 'int yyparse()': input.c:475:37: error: enumeral and non-enumeral type in conditional expression [-Werror=extra] 475 \| (0 <= (YYX) && (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYSYMBOL_YYUNDEF) \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ input.c:1024:17: note: in expansion of macro 'YYTRANSLATE' 1024 \| yytoken = YYTRANSLATE (yychar); \| ^~~~~~~~~~~ * data/skeletons/yacc.c (yytranslate, yysymbol_name) (yyparse_context_t, yyexpected_tokens, yypstate_expected_tokens) (yysyntax_error_arguments): Use yysymbol_type_t instead of int.	2020-04-01 08:31:48 +02:00
Akim Demaille	f62f1db298	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	3ba001baac	yacc.c: introduce an enum that defines the symbol's number There's a number of advantages in exposing the symbol (internal) numbers: - custom error messages can use them to decide how to represent a given symbol, or a set of symbols. - we need something similar in uses of yyexpected_tokens. For instance, currently, bistromathic's completion() reads: int ntokens = expected_tokens (line, tokens, YYNTOKENS); [...] for (int i = 0; i < ntokens; ++i) if (tokens[i] == YYTRANSLATE (TOK_VAR)) [...] else if (tokens[i] == YYTRANSLATE (TOK_FUN)) [...] else [...] - now that it's a compile-time expression, we can easily build static tables, switch, etc. - some users depended on the ability to get the token number from a symbol to write test cases for their scanners. But Bison 3.5 removed the table this feature depended upon (a reverse yytranslate). Now they can check against the actual symbol number, without having to pay (space and time) for a conversion. See https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html, and https://lists.gnu.org/archive/html/bug-bison/2020-03/msg00015.html. - it helps us clearly separate the internal symbol numbers from the external token numbers, whose difference is sometimes blurred in the code when values coincide (e.g. "yychar = yytoken = YYEOF"). - it allows us to get rid of ugly macros with inconsistent names such as YYUNDEFTOK and YYTERROR, and to group related definitions together. - similarly it provides a clean access to the $accept symbol (which proves convenient in a current experimentation of mine with several %start symbols). Let's declare this type as a private type (in the .c file, not the .h one). So it does not need to be influenced by the api prefix. * data/skeletons/bison.m4 (b4_symbol_sid): New. (b4_symbol): Use it. * data/skeletons/c.m4 (b4_symbol_enum, b4_declare_symbol_enum): New. * data/skeletons/yacc.c: Use b4_declare_symbol_enum. (YYUNDEFTOK, YYTERROR): Remove. Use the corresponding symbol enum instead.	2020-04-01 08:31:33 +02:00
Akim Demaille	4140320a0a	style: comment changes about token numbers * data/skeletons/bison.m4, data/skeletons/c.m4: here.	2020-03-30 08:41:12 +02:00
Akim Demaille	af19fd7e0f	tests: recheck: work properly when the test suite was interrupted * tests/local.mk (recheck): Look at the per-test logs, not the overall log, which, when interrupted, contains only information about... the tests that passed.	2020-03-30 08:41:12 +02:00
Akim Demaille	2c74872991	java: move away from _ for internationalization The "_" is becoming a keyword in Java, which causes tons of warnings currently in our test suite. GNU Gettext is now using "i18n" instead of "_" (https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=e89fea36545f27487d9652a13e6a0adbea1117d0). * data/skeletons/java.m4: Use "i18n", not "_". * examples/java/calc/Calc.y, tests/calc.at: Adjust.	2020-03-30 08:03:10 +02:00
Akim Demaille	50517d578c	regen	2020-03-28 15:13:27 +01:00
Akim Demaille	59d820d1ef	c: use YYNOMEM instead of -2 See `84b1972c96`. * data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): New. Use it.	2020-03-28 15:13:27 +01:00
Akim Demaille	90f0500ef8	todo: update * TODO (Token Number): We have to clean this. (Naming conventions, Symbol numbers): New. (Bad styling): Addressed in `e21ff47f5d`.	2020-03-28 15:13:27 +01:00
Akim Demaille	17a9542c4f	regen	2020-03-28 15:13:27 +01:00
Akim Demaille	b7045aa706	java: make yysyntaxErrorArguments a private detail * data/skeletons/lalr1.java (yysyntaxErrorArguments): Move it from the context, to the parser object. Generate only for detailed and verbose error messages. * tests/local.at (AT_YYERROR_DEFINE(java)): Use yyexpectedTokens instead.	2020-03-28 15:13:27 +01:00
Akim Demaille	ee56b6e0f2	skeletons: make yysyntax_error_arguments a private detail We could just "inline yysyntax_error_arguments back" in the routines it was originally extracted from, but I think the code is nicer to read this way. * data/skeletons/glr.c (yysyntax_error_arguments): Generate only for detailed and verbose error messages. * data/skeletons/yacc.c: Likewise. * data/skeletons/lalr1.cc (parser::context::yysyntax_error_arguments): Move as... (parser::yysyntax_error_arguments_): this. And only for detailed and verbose error messages.	2020-03-28 15:13:27 +01:00
Akim Demaille	1edc98f793	lalr1.cc: avoid using yysyntax_error_arguments * data/skeletons/lalr1.cc (context::token): New. * tests/local.at (yyreport_syntax_error): Don't use yysyntax_error_arguments.	2020-03-28 15:13:27 +01:00
Akim Demaille	4192de1f41	bison: avoid using yysyntax_error_arguments * src/parse-gram.y (yyreport_syntax_error): Use yyparse_context_token and yyexpected_tokens.	2020-03-28 15:13:27 +01:00
Akim Demaille	00b0d02955	tests: yacc.c: avoid yysyntax_error_arguments Because glr.c shares the same testing routines, we also need to convert it. * data/skeletons/glr.c (yyparse_context_token): New. * tests/local.at (yyreport_syntax_error): here.	2020-03-28 15:13:27 +01:00
Akim Demaille	1045c8d0ef	examples: don't use yysyntax_error_arguments Suggested by Adrian Vogelsgesang. https://lists.gnu.org/archive/html/bison-patches/2020-02/msg00069.html * data/skeletons/lalr1.java (Context.EMPTY, Context.getToken): New. (Context.yyntokens): Rename as... (Context.NTOKENS): this. Because (i) all the Java coding styles recommend upper case for constants, and (ii) the Java Skeleton exposes Lexer.EOF, not Lexer.YYEOF. * data/skeletons/yacc.c (yyparse_context_token): New. * examples/c/bistromathic/parse.y (yyreport_syntax_error): Don't use yysyntax_error_arguments. * examples/java/calc/Calc.y (yyreportSyntaxError): Likewise.	2020-03-28 15:13:27 +01:00
Akim Demaille	ef8965b5f5	skeletons: fix incorrect type for translatable tokens * data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c: Fix confusion between the "translatable" and the "translate" tables.	2020-03-28 15:13:27 +01:00
Akim Demaille	84b1972c96	yacc.c: use negative numbers for errors in auxiliary functions yyparse returns 0, 1, 2 since ages (accept, reject, memory exhausted). Some of our auxiliary functions such as yy_lac and yyreport_syntax_error also need to return error codes and also use 0, 1, 2. Because it uses yy_lac, yyexpected_tokens also needs to return "problem", "memory exhausted", but in case of success, it needs to return the number of tokens, so it cannot use 1 and 2 as error code. Currently it uses -1 and -2, which is later converted into 1 and 2 as yacc.c expects it. Let's simplify this and use consistently -1 and -2 for auxiliary functions that are not exposed (or not yet exposed) to the user. In particular this will save the user from having to convert yyexpected_tokens's -2 into yyreport_syntax_error's 2: both return -1 or -2. * data/skeletons/yacc.c (yy_lac, yyreport_syntax_error) (yy_lac_stack_realloc): Return -1, -2 for errors instead of 1, 2. Adjust callers. * examples/c/bistromathic/parse.y (yyreport_syntax_error): Do take error codes into account. Issue a syntax error message even if we ran out of memory. * src/parse-gram.y, tests/local.at (yyreport_syntax_error): Adjust.	2020-03-23 07:02:36 +01:00
Akim Demaille	1079595b2a	style: reduce length of private constant * data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c (YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as... (YYARGS_MAX): this. * src/parse-gram.y (YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as... (ARGS_MAX): this.	2020-03-23 07:02:34 +01:00
Akim Demaille	e364bcdbc5	doc: c++: promote api.token.raw * doc/bison.texi (Calc++ Parser): Here.	2020-03-23 07:02:32 +01:00
Akim Demaille	5a8db8a739	bench: calc: no need for super long inputs * etc/bench.pl.in ($iterations): Restore initial value, -1, meaning "at least one second". ($calc_input): There is no need to generate 400 lines.	2020-03-22 15:59:22 +01:00
Akim Demaille	5acc29041e	bench: calc: work on a string instead of a file The cost of the file layer is large and makes benchmarks too coarse, as seen for in following example, first with a file, then with a literal string: 0. %skeleton "yacc.c" %define parse.lac full 1. %skeleton "yacc-v1.c" %define nofinal %define parse.lac full 2. %skeleton "yacc-v2.c" %define nofinal %define parse.lac full 3. %skeleton "yacc-v3.c" %define nofinal %define parse.lac full 4. %skeleton "yacc.c" 5. %skeleton "yacc-v1.c" %define nofinal 6. %skeleton "yacc-v2.c" %define nofinal 7. %skeleton "yacc-v3.c" %define nofinal -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- BM_y0 32558 ns 32537 ns 21228 BM_y1 32400 ns 32369 ns 21233 BM_y2 33485 ns 33464 ns 20625 BM_y3 32139 ns 32125 ns 21446 BM_y4 31343 ns 31329 ns 21747 BM_y5 31344 ns 31317 ns 22035 BM_y6 31287 ns 31255 ns 22039 BM_y7 31387 ns 31373 ns 22178 -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- BM_y0 10642 ns 10634 ns 63601 BM_y1 10657 ns 10654 ns 63625 BM_y2 10441 ns 10432 ns 65957 BM_y3 10558 ns 10554 ns 64546 BM_y4 9521 ns 9516 ns 72011 BM_y5 9179 ns 9157 ns 75028 BM_y6 9360 ns 9356 ns 73770 BM_y7 9365 ns 9359 ns 72609 Of course, at the same time it is less realistic: most users read files rather that strings, so it might lead to us to pay attention to costs most people don't see. * etc/bench.pl.in (&calc_input): Output into a file given as argument. Output in C syntax. (&generate_grammar_calc): Use it. Simplify the grammar: remove operators we don't care about. Rewrite the scanner to work on a char* instead of a FILE*.	2020-03-22 15:59:22 +01:00
Akim Demaille	5b0b0a1e08	bench: add a "latest" symlink * etc/bench.pl.in: here.	2020-03-22 15:59:14 +01:00
Akim Demaille	1c694e08cc	bench: use the same prefix in both bench methods * etc/bench.pl.in (&bench_with_timethese): Also use y$i, as in &bench_with_gbenchmark. (&generate_grammar_calc): Don't add a prefix, let the callers do it.	2020-03-22 15:59:13 +01:00
Akim Demaille	4cfb067d93	bench: use a C++-11 compiler See https://github.com/google/benchmark#a-faster-keeprunning-loop. * etc/bench.pl.in ($cxx): Be C++11. (&bench_with_gbenchmark): Adjust.	2020-03-22 15:59:13 +01:00
Akim Demaille	cf60d0a617	bench: create a README file with benches * etc/bench.pl.in (&bench_with_gbenchmark): Here.	2020-03-22 15:59:13 +01:00
Akim Demaille	c0e8489605	bench: calc: add support for google benchmark * etc/bench.pl.in (&compiler): New, extracted from... (&compile): here. Don't link when using gbm. (&calc_input): Don't make massive input for micro benchmarks. (&generate_grammar_calc): When using gbm, use api.prefix to avoid name collisions. Be ready to issue BENCHMARKS instead of a main. (&bench): Rename as... (&bench_with_timethese): this. (&bench_with_gbenchmark): New. (&bench): New. Dispatch on these two.	2020-03-21 18:19:14 +01:00
Akim Demaille	788b1a6858	bench: better error messages on invalid input * etc/bench.pl.in: here.	2020-03-21 18:17:09 +01:00
Akim Demaille	56414791e9	bench: simplify the calc grammar * etc/bench.pl.in (generate_grammar_calc): We don't need global_result etc.	2020-03-21 18:17:02 +01:00
Akim Demaille	675dcf1962	bench: die clearly on incorrect --grammar arguments * etc/bench.pl.in (getopt): here.	2020-03-21 14:52:41 +01:00
Akim Demaille	466fb66578	regen	2020-03-17 19:21:24 +01:00
Akim Demaille	cbb967dbad	yacc.c: style: prefer switch to if * data/skeletons/yacc.c: Prefer switch to decode yy_lac's return value.	2020-03-17 19:21:07 +01:00
Akim Demaille	44ac18d136	yacc.c: yypstate_expected_tokens In push parsers, when asking for the list of expected tokens at some point, it makes no sense to build a yyparse_context_t: the yypstate alone suffices (the only difference being the lookahead). Instead of forcing the user to build a useless shell around yypstate, let's offer yypstate_expected_tokens. See https://lists.gnu.org/r/bison-patches/2020-03/msg00025.html. * data/skeletons/yacc.c (yypstate): Declare earlier, so that we can use it for... (yypstate_expected_tokens): this new function, when in push parsers. Adjust dependencies. * examples/c/bistromathic/parse.y: Simplify: use yypstate_expected_tokens. Style fixes. Reduce scopes (reported by Joel E. Denny).	2020-03-17 19:20:13 +01:00
Akim Demaille	0c3dd3a669	examples: bistromathic: simplify * examples/c/bistromathic/parse.y (expected_tokens): Remove useless "break".	2020-03-09 07:24:33 +01:00
Akim Demaille	951da960e6	merge branch 'maint' * upstream/maint: maint: post-release administrivia version 3.5.3 news: update for 3.5.3 yacc.c: make sure we properly propagated the user's number for error diagnostics: don't crash because of repeated definitions of error style: initialize some struct members diagnostics: beware of zero-width characters diagnostics: be sure to close the styling when lines are too short muscles: fix incorrect decoding of $ code: be robust to reference with invalid tags build: fix typo doc: update recommandation for libtextstyle style: comment changes examples: use consistently the GFDL header for readmes style: remove useless declarations typo: succesful -> successful README: point to tests/bison, and document --trace gnulib: update maint: post-release administrivia	2020-03-08 10:13:16 +01:00
Akim Demaille	15ea35019f	maint: post-release administrivia * NEWS: Add header line for next release. * .prev-version: Record previous version. * cfg.mk (old_NEWS_hash): Auto-update.	2020-03-08 08:50:10 +01:00
Akim Demaille	f49684a577	version 3.5.3 * NEWS: Record release date. v3.5.3	2020-03-08 08:30:41 +01:00
Akim Demaille	044ad1288c	news: update for 3.5.3	2020-03-08 08:17:13 +01:00
Akim Demaille	e3812bb8c3	yacc.c: make sure we properly propagated the user's number for error * data/skeletons/yacc.c (YYERRCODE): Be truthful. * tests/input.at (Redefining the error token): Check that.	2020-03-08 08:10:11 +01:00
Akim Demaille	cfcd823e16	diagnostics: don't crash because of repeated definitions of error According to https://www.unix.com/man-page/POSIX/1posix/yacc/, the user is allowed to specify her user number for the error token: The token error shall be reserved for error handling. The name error can be used in grammar rules. It indicates places where the parser can recover from a syntax error. The default value of error shall be 256. Its value can be changed using a %token declaration. The lexical analyzer should not return the value of error. I think this feature is useless, the user should not have to deal with that. The intent is probably to give the user a means to use 256 if she wants to, but provided "error" cleared the path first by being assigned another number. In the case of Bison, 256 is assigned to "error" at the end if the user did not use it for a token of hers. So this feature is useless. Yet it is valid, and if the user assigns twice a token number to "error", then the second time we want to complain about it and want to show the original definition. At this point, we try to display the built-in definition of "error", whose location is NULL, and we crash. Rather, the location of the first user definition of "error" should become its defining location. Reported by Ahcheong Lee. https://lists.gnu.org/r/bug-bison/2020-03/msg00007.html * src/symtab.c (symbol_class_set): If this is a declaration and the symbol was not declared yet, keep this as defining location. * tests/input.at (Redefining the error token): New.	2020-03-08 08:10:11 +01:00
Akim Demaille	2f02d9beae	style: initialize some struct members * src/symtab.c (sym_content_new): Initialize all the location members. Not needed by the code, but disturbing values when using a debugger.	2020-03-08 08:10:11 +01:00
Akim Demaille	b638603477	diagnostics: beware of zero-width characters Currently we rely on (visual) width of the characters to decide where to open and close the styling of the quoted lines. This breaks when we deal with zero-width characters: we cannot just rely on (visual) columns, we need to know whether we are before, inside, or after the highlighted portion. * src/location.c (location_caret): col_end: no longer add 1, "regular" characters have a width of 1, only 0-width characters have 0-width. opened: replace with 'state', a three-valued enum. Don't reopen the style if we already did. * tests/diagnostics.at (Zero-width characters): New.	2020-03-08 08:10:11 +01:00
Akim Demaille	e21ff47f5d	diagnostics: be sure to close the styling when lines are too short bar.y:4.12-17: <error>error:</error> redefining user token number of foo - 4 \| %token foo <error>123 + 4 \| %token foo <error>123</error> \| <error>^~~~~~</error> * src/location.c (location_caret): Be sure to close. * tests/diagnostics.at (Line is too short, and then you die): New.	2020-03-07 10:01:52 +01:00
Akim Demaille	b82b387da9	muscles: fix incorrect decoding of $ Bug introduced in `458171e6df`. https://lists.gnu.org/archive/html/bison-patches/2013-11/msg00009.html Reported by Ahcheong Lee. https://lists.gnu.org/r/bug-bison/2020-03/msg00010.html * src/muscle-tab.c (COMMON_DECODE): "$" is coded as "$][", not "$[][". * tests/input.at ("%define" enum variables): Check that case.	2020-03-07 07:45:10 +01:00
Akim Demaille	641e326303	code: be robust to reference with invalid tags Because we want to support $<a->b>$, we must accept -> in type tags, and reject $<->$, as it is unfinished. Reported by Ahcheong Lee. * src/scan-code.l (yylex): Make sure "tag" does not end with -, since -> does not close the tag. * tests/input.at (Stray $ or @): Check this.	2020-03-06 17:29:26 +01:00
Akim Demaille	192e9fdf77	build: fix typo * build-aux/cross-options.pl: here.	2020-03-06 08:32:26 +01:00
Akim Demaille	a4a3f08c11	doc: update recommandation for libtextstyle * README: here.	2020-03-06 08:32:18 +01:00
Akim Demaille	666df338a7	style: comment changes * src/symtab.h, src/lr0.c: here.	2020-03-06 08:32:03 +01:00
Akim Demaille	b437b16603	examples: use consistently the GFDL header for readmes * examples/c++/README.md, examples/c++/calc++/README.md, * examples/c/calc/README.md, examples/c/lexcalc/README.md, * examples/c/reccalc/README.md: Prefer the GFDL banner to the GPL one.	2020-03-06 08:31:34 +01:00
Akim Demaille	b493c173c9	style: remove useless declarations * src/reader.h: Don't duplicate what parse-gram.h already exposes. * src/lr0.h: Remove useless include.	2020-03-06 08:30:21 +01:00

7004 Commits