bison

mirror of https://git.savannah.gnu.org/git/bison.git synced 2026-04-24 18:52:21 +00:00

Author	SHA1	Message	Date
Akim Demaille	cb40f5c624	build: fix syntax-check issues * src/system.h, tests/local.mk: Fix indentation.	2020-04-04 08:04:11 +02:00
Akim Demaille	6c23b012b9	tests: recheck: work properly when the test suite was interrupted * tests/local.mk (recheck): Look at the per-test logs, not the overall log, which, when interrupted, contains only information about... the tests that passed.	2020-04-02 07:32:48 +02:00
Akim Demaille	ef88dfba81	doc: c++: promote api.token.raw * doc/bison.texi (Calc++ Parser): Here.	2020-04-02 07:32:01 +02:00
Akim Demaille	6e89bc0fd2	build: fix compatibility with old compilers GCC 4.2 dies with src/InadequacyList.c: In function 'InadequacyList__new_conflict': src/InadequacyList.c:37: error: #pragma GCC diagnostic not allowed inside functions src/InadequacyList.c:37: error: #pragma GCC diagnostic not allowed inside functions src/InadequacyList.c:40: error: #pragma GCC diagnostic not allowed inside functions Reported by Evan Lavelle. See https://lists.gnu.org/r/bug-bison/2020-03/msg00021.html and https://trac.macports.org/ticket/59927. * src/system.h (GCC_VERSION): New. Use it to control IGNORE_TYPE_LIMITS_BEGIN and IGNORE_TYPE_LIMITS_END.	2020-04-02 07:16:44 +02:00
Akim Demaille	e3e21cc0d8	examples: reccalc: compile cleanly in C99 See https://trac.macports.org/ticket/59927. * examples/c/reccalc/parse.y: C99 does not allow multiple typedefs.	2020-04-02 07:14:19 +02:00
Akim Demaille	a0ee2a7543	c++: replace symbol_number_type with symbol_type_type * data/skeletons/c++.m4, data/skeletons/glr.cc, * data/skeletons/lalr1.cc: here.	2020-04-01 08:32:58 +02:00
Akim Demaille	7e28dbea11	c++: also use symbol_type_type Because of the insane current implementation of glr.cc, things are a bit nasty. We will rename symbol_number_type as symbol_type_type later, to keep this commit small. * data/skeletons/c++.m4 (b4_declare_symbol_enum): New. Also define YYNTOKENS to avoid type clashes when yyntokens_ was actually defined in another enum. Use it. (symbol_number_type): Be an alias of symbol_type_type. Use YYSYMBOL_YYEMPTY and the like. Use symbol_number_type where appropriate. (empty_symbol): Remove. (yytranslate_): Use symbol_number_type, not token_number_type. * data/skeletons/lalr1.cc: Use symbol_number_type where appropriate. Adjust to the replacement of empty_symbol by YYSYMBOL_YYEMPTY. (yy_error_token_, yy_undef_token_, yyeof_, yyntokens_): Remove. Adjust dependencies. * data/skeletons/glr.cc: Use symbol_number_type where appropriate. Forward definitions of YYSYMBOL_YYEMPTY, etc. to glr.c. * tests/headers.at: Accept YYNTOKENS and other YYSYMBOL_. tests/local.at (AT_YYERROR_DEFINE(c++)): Use symbol_number_type.	2020-04-01 08:32:50 +02:00
Akim Demaille	65df8d6747	glr.c: remove the yySymbol alias * data/skeletons/glr.c: Use yysymbol_type_t only.	2020-04-01 08:31:48 +02:00
Akim Demaille	beea39b2ec	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	086506bf23	glr.c, yacc.c: propagate yysymbol_type_t Now that yacc.c and glr.c both know yysymbol_type_t, convert the common routines. * data/skeletons/c.m4 (yydestruct, yy_symbol_value_print) (yy_symbol_print): Use yysymbol_type_t instead of int. * data/skeletons/glr.c: Use yySymbol where appropriate. * data/skeletons/yacc.c (YY_ACCESSING_SYMBOL): New wrapper around yystos. Use it. * tests/local.at (yyreport_syntax_error): Use yysymbol_type_t where appropriate.	2020-04-01 08:31:48 +02:00
Akim Demaille	39792f57fb	glr.c: use yysymbol_type_t, YYSYMBOL_YYEOF etc. Apply the same changes as in yacc.c. Now yySymbol and yysymbol_type_t are aliases. We will remove the former later, to avoid cluttering this commit. * data/skeletons/glr.c: Use b4_declare_symbol_enum. Use YYSYMBOL_YYEOF etc. where appropriate. (YYUNDEFTOK, YYTERROR): Remove. (YYTRANSLATE, yySymbol, yyexpected_tokens, yysyntax_error_arguments): Adjust. (yy_accessing_symbol): New. Use it where appropriate.	2020-04-01 08:31:48 +02:00
Akim Demaille	65bbaf9598	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	9039c571f4	yacc.c: fix more errors from make maintainer-check-g++ * data/skeletons/yacc.c (yyexpected_tokens): Use casts where needed.	2020-04-01 08:31:48 +02:00
Akim Demaille	d3db22d788	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	9434571f95	yacc.c: revert to not using yysymbol_type_t in the yytranslate table This triggers warnings with several compilers. For instance ICC fills the logs with pages and pages of input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}" 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, ^ input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}" 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, ^ And so does G++9 when compiling yacc.c's (C) output input.c:545:8: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive] 545 \| 0, 5, 9, 2, 2, 2, 2, 2, 2, 2, \| ^ \| \| \| int input.c:545:15: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive] 545 \| 0, 5, 9, 2, 2, 2, 2, 2, 2, 2, \| ^ \| \| \| int Clang++ is no exception input.c:545:8: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int' 0, 5, 9, 2, 2, 2, 2, 2, 2, 2, ^ input.c:545:15: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int' 0, 5, 9, 2, 2, 2, 2, 2, 2, 2, ^ At some point we could use yysymbol_type_t's enumerators to define yytranslate. Meanwhile... * data/skeletons/yacc.c (yytranslate): Use the original integral type to define it. (YYTRANSLATE): Cast the result into yysymbol_type_t.	2020-04-01 08:31:48 +02:00
Akim Demaille	0cdbcee0ce	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	fd37eb057e	yysymbol_type_t: always assign an enumerator Currently we define enumerators only for symbols that have an identifier. That rules out tokens such as '+', and nonterminals such as foo-bar and foo.bar. As a consequence we are taking chances: the compiler might compile yysymbol_type_t as too small an integral type for some symbol codes. * data/skeletons/bison.m4 (b4_symbol_sid): Forge a unique symbol identifier for symbols that don't have an ID.	2020-04-01 08:31:48 +02:00
Akim Demaille	ecc3a13c34	bistromathic: use symbol numbers instead of YYTRANSLATE * examples/c/bistromathic/parse.y: here.	2020-04-01 08:31:48 +02:00
Akim Demaille	04904e4d28	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	75a605454d	yacc.c: prefer YYSYMBOL_YYERROR to YYSYMBOL_error * data/skeletons/bison.m4 (b4_symbol_sid): Map "error" to YYSYMBOL_YYERROR. * data/skeletons/yacc.c: Adjust.	2020-04-01 08:31:48 +02:00
Akim Demaille	d7f39ac507	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	f3c18c8e80	yacc.c: also define a symbol number for the empty token This is not only cleaner, it also protects us from mixing signed values (YYEMPTY is #defined as -2) with unsigned types (the yysymbol_type_t enum is typically compiled as a small unsigned). For instance GCC 9: input.c: In function 'yyparse': input.c:1107:7: error: conversion to 'unsigned int' from 'int' may change the sign of the result [-Werror=sign-conversion] 1107 \| yyn += yytoken; \| ^~ input.c:1107:10: error: conversion to 'int' from 'unsigned int' may change the sign of the result [-Werror=sign-conversion] 1107 \| yyn += yytoken; \| ^~~~~~~ input.c:1108:47: error: comparison of integer expressions of different signedness: 'yytype_int8' {aka 'const signed char'} and 'yysymbol_type_t' {aka 'enum yysymbol_type_t'} [-Werror=sign-compare] 1108 \| if (yyn < 0 \|\| YYLAST < yyn \|\| yycheck[yyn] != yytoken) \| ^~ input.c:702:25: error: operand of ?: changes signedness from 'int' to 'unsigned int' due to unsignedness of other operand [-Werror=sign-compare] 702 \| #define YYEMPTY (-2) \| ^~~~ input.c:1220:33: note: in expansion of macro 'YYEMPTY' 1220 \| yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar); \| ^~~~~~~ input.c:1220:41: error: unsigned conversion from 'int' to 'unsigned int' changes value from '-2' to '4294967294' [-Werror=sign-conversion] 1220 \| yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar); \| ^ Eventually, it might be interesting to move away from -2 (which is the only possible negative symbol number) and use the next available number, to save bits. We could actually even simply use "0" and shift the rest, which would allow to write "!yytoken" to mean really "yytoken != YYEMPTY". * data/skeletons/c.m4 (b4_declare_symbol_enum): Define YYSYMBOL_YYEMPTY. * data/skeletons/yacc.c: Use it. * src/parse-gram.y (yyreport_syntax_error): Use YYSYMBOL_YYEMPTY, not YYEMPTY, when dealing with a symbol. * tests/regression.at: Adjust.	2020-04-01 08:31:48 +02:00
Akim Demaille	00c80bc96c	yacc.c: use yysymbol_type_t instead of int for yytoken Now that we have a proper type for internal symbol numbers, let's use it. More code needs conversion, e.g., printers and destructors, but they are shared with glr.c, which is not ready yet for this change. It will also help us deal with warnings such as (GCC9 on GNU/Linux): input.c: In function 'int yyparse()': input.c:475:37: error: enumeral and non-enumeral type in conditional expression [-Werror=extra] 475 \| (0 <= (YYX) && (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYSYMBOL_YYUNDEF) \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ input.c:1024:17: note: in expansion of macro 'YYTRANSLATE' 1024 \| yytoken = YYTRANSLATE (yychar); \| ^~~~~~~~~~~ * data/skeletons/yacc.c (yytranslate, yysymbol_name) (yyparse_context_t, yyexpected_tokens, yypstate_expected_tokens) (yysyntax_error_arguments): Use yysymbol_type_t instead of int.	2020-04-01 08:31:48 +02:00
Akim Demaille	f62f1db298	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	3ba001baac	yacc.c: introduce an enum that defines the symbol's number There's a number of advantage in exposing the symbol (internal) numbers: - custom error messages can use them to decide how to represent a given symbol, or a set of symbols. - we need something similar in uses of yyexpected_tokens. For instance, currently, bistromathic's completion() reads: int ntokens = expected_tokens (line, tokens, YYNTOKENS); [...] for (int i = 0; i < ntokens; ++i) if (tokens[i] == YYTRANSLATE (TOK_VAR)) [...] else if (tokens[i] == YYTRANSLATE (TOK_FUN)) [...] else [...] - now that it's a compile-time expression, we can easily build static tables, switch, etc. - some users depended on the ability to get the token number from a symbol to write test cases for their scanners. But Bison 3.5 removed the table this feature depended upon (a reverse yytranslate). Now they can check against the actual symbol number, without having pay (space and time) a conversion. See https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html, and https://lists.gnu.org/archive/html/bug-bison/2020-03/msg00015.html. - it helps us clearly separate the internal symbol numbers from the external token numbers, whose difference is sometimes blurred in the code when values coincide (e.g. "yychar = yytoken = YYEOF"). - it allows us to get rid of ugly macros with inconsistent names such as YYUNDEFTOK and YYTERROR, and to group related definitions together. - similarly it provides a clean access to the $accept symbol (which proves convenient in a current experimentation of mine with several %start symbols). Let's declare this type as a private type (in the .c file, not the .h one). So it does not need to be influenced by the api prefix. * data/skeletons/bison.m4 (b4_symbol_sid): New. (b4_symbol): Use it. * data/skeletons/c.m4 (b4_symbol_enum, b4_declare_symbol_enum): New. * data/skeletons/yacc.c: Use b4_declare_symbol_enum. (YYUNDEFTOK, YYTERROR): Remove. Use the corresponding symbol enum instead.	2020-04-01 08:31:33 +02:00
Akim Demaille	4140320a0a	style: comment changes about token numbers * data/skeletons/bison.m4, data/skeletons/c.m4: here.	2020-03-30 08:41:12 +02:00
Akim Demaille	af19fd7e0f	tests: recheck: work properly when the test suite was interrupted * tests/local.mk (recheck): Look at the per-test logs, not the overall log, which, when interrupted, contains only information about... the tests that passed.	2020-03-30 08:41:12 +02:00
Akim Demaille	2c74872991	java: move away from _ for internationalization The "_" is becoming a keyword in Java, which causes tons of warnings currently in our test suite. GNU Gettext is now using "i18n" instead of "_" (https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=e89fea36545f27487d9652a13e6a0adbea1117d0). * data/skeletons/java.m4: Use "i18n", not "_". * examples/java/calc/Calc.y, tests/calc.at: Adjust.	2020-03-30 08:03:10 +02:00
Akim Demaille	50517d578c	regen	2020-03-28 15:13:27 +01:00
Akim Demaille	59d820d1ef	c: use YYNOMEM instead of -2 See `84b1972c96`. * data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): New. Use it.	2020-03-28 15:13:27 +01:00
Akim Demaille	90f0500ef8	todo: update * TODO (Token Number): We have to clean this. (Naming conventions, Symbol numbers): New. (Bad styling): Addressed in `e21ff47f5d`.	2020-03-28 15:13:27 +01:00
Akim Demaille	17a9542c4f	regen	2020-03-28 15:13:27 +01:00
Akim Demaille	b7045aa706	java: make yysyntaxErrorArguments a private detail * data/skeletons/lalr1.java (yysyntaxErrorArguments): Move it from the context, to the parser object. Generate only for detailed and verbose error messages. * tests/local.at (AT_YYERROR_DEFINE(java)): Use yyexpectedTokens instead.	2020-03-28 15:13:27 +01:00
Akim Demaille	ee56b6e0f2	skeletons: make yysyntax_error_arguments a private detail We could just "inline yysyntax_error_arguments back" in the routines it was originally extracted from, but I think the code is nicer to read this way. * data/skeletons/glr.c (yysyntax_error_arguments): Generate only for detailed and verbose error messages. * data/skeletons/yacc.c: Likewise. * data/skeletons/lalr1.cc (parser::context::yysyntax_error_arguments): Move as... (parser::yysyntax_error_arguments_): this. And only for detailed and verbose error messages.	2020-03-28 15:13:27 +01:00
Akim Demaille	1edc98f793	lalr1.cc: avoid using yysyntax_error_arguments * data/skeletons/lalr1.cc (context::token): New. * tests/local.at (yyreport_syntax_error): Don't use yysyntax_error_arguments.	2020-03-28 15:13:27 +01:00
Akim Demaille	4192de1f41	bison: avoid using yysyntax_error_arguments * src/parse-gram.y (yyreport_syntax_error): Use yyparse_context_token and yyexpected_tokens.	2020-03-28 15:13:27 +01:00
Akim Demaille	00b0d02955	tests: yacc.c: avoid yysyntax_error_arguments Because glr.c shares the same testing routines, we also need to convert it. * data/skeletons/glr.c (yyparse_context_token): New. * tests/local.at (yyreport_syntax_error): here.	2020-03-28 15:13:27 +01:00
Akim Demaille	1045c8d0ef	examples: don't use yysyntax_error_arguments Suggested by Adrian Vogelsgesang. https://lists.gnu.org/archive/html/bison-patches/2020-02/msg00069.html * data/skeletons/lalr1.java (Context.EMPTY, Context.getToken): New. (Context.yyntokens): Rename as... (Context.NTOKENS): this. Because (i) all the Java coding styles recommend upper case for constants, and (ii) the Java Skeleton exposes Lexer.EOF, not Lexer.YYEOF. * data/skeletons/yacc.c (yyparse_context_token): New. * examples/c/bistromathic/parse.y (yyreport_syntax_error): Don't use yysyntax_error_arguments. * examples/java/calc/Calc.y (yyreportSyntaxError): Likewise.	2020-03-28 15:13:27 +01:00
Akim Demaille	ef8965b5f5	skeletons: fix incorrect type for translatable tokens * data/skeletons/glr.c, data/skeletons/lalr1.c, data/skeletons/yacc.c: Fix confusion between the "translatable" and the "translate" tables.	2020-03-28 15:13:27 +01:00
Akim Demaille	84b1972c96	yacc.c: use negative numbers for errors in auxiliary functions yyparse returns 0, 1, 2 since ages (accept, reject, memory exhausted). Some of our auxiliary functions such as yy_lac and yyreport_syntax_error also need to return error codes and also use 0, 1, 2. Because it uses yy_lac, yyexpected_tokens also needs to return "problem", "memory exhausted", but in case of success, it needs to return the number of tokens, so it cannot use 1 and 2 as error code. Currently it uses -1 and -2, which is later converted into 1 and 2 as yacc.c expects it. Let's simplify this and use consistently -1 and -2 for auxiliary functions that are not exposed (or not yet exposed) to the user. In particular this will save the user from having to convert yyexpected_tokens's -2 into yyreport_syntax_error's 2: both return -1 or -2. * data/skeletons/yacc.c (yy_lac, yyreport_syntax_error) (yy_lac_stack_realloc): Return -1, -2 for errors instead of 1, 2. Adjust callers. * examples/c/bistromathic/parse.y (yyreport_syntax_error): Do take error codes into account. Issue a syntax error message even if we ran out of memory. * src/parse-gram.y, tests/local.at (yyreport_syntax_error): Adjust.	2020-03-23 07:02:36 +01:00
Akim Demaille	1079595b2a	style: reduce length of private constant * data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c (YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as... (YYARGS_MAX): this. * src/parse-gram.y (YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as... (ARGS_MAX): this.	2020-03-23 07:02:34 +01:00
Akim Demaille	e364bcdbc5	doc: c++: promote api.token.raw * doc/bison.texi (Calc++ Parser): Here.	2020-03-23 07:02:32 +01:00
Akim Demaille	5a8db8a739	bench: calc: no need for super long inputs * etc/bench.pl.in ($iterations): Restore initial value, -1, meaning "at least one second". ($calc_input): There is no need to generate 400 lines.	2020-03-22 15:59:22 +01:00
Akim Demaille	5acc29041e	bench: calc: work on a string instead of a file The cost of the file layer is large and makes benchmarks too coarse, as seen for in following example, first with a file, then with a literal string: 0. %skeleton "yacc.c" %define parse.lac full 1. %skeleton "yacc-v1.c" %define nofinal %define parse.lac full 2. %skeleton "yacc-v2.c" %define nofinal %define parse.lac full 3. %skeleton "yacc-v3.c" %define nofinal %define parse.lac full 4. %skeleton "yacc.c" 5. %skeleton "yacc-v1.c" %define nofinal 6. %skeleton "yacc-v2.c" %define nofinal 7. %skeleton "yacc-v3.c" %define nofinal -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- BM_y0 32558 ns 32537 ns 21228 BM_y1 32400 ns 32369 ns 21233 BM_y2 33485 ns 33464 ns 20625 BM_y3 32139 ns 32125 ns 21446 BM_y4 31343 ns 31329 ns 21747 BM_y5 31344 ns 31317 ns 22035 BM_y6 31287 ns 31255 ns 22039 BM_y7 31387 ns 31373 ns 22178 -------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------- BM_y0 10642 ns 10634 ns 63601 BM_y1 10657 ns 10654 ns 63625 BM_y2 10441 ns 10432 ns 65957 BM_y3 10558 ns 10554 ns 64546 BM_y4 9521 ns 9516 ns 72011 BM_y5 9179 ns 9157 ns 75028 BM_y6 9360 ns 9356 ns 73770 BM_y7 9365 ns 9359 ns 72609 Of course, at the same time it is less realistic: most users read files rather that strings, so it might lead to us to pay attention to costs most people don't see. * etc/bench.pl.in (&calc_input): Output into a file given as argument. Output in C syntax. (&generate_grammar_calc): Use it. Simplify the grammar: remove operators we don't care about. Rewrite the scanner to work on a char* instead of a FILE*.	2020-03-22 15:59:22 +01:00
Akim Demaille	5b0b0a1e08	bench: add a "latest" symlink * etc/bench.pl.in: here.	2020-03-22 15:59:14 +01:00
Akim Demaille	1c694e08cc	bench: use the same prefix in both bench methods * etc/bench.pl.in (&bench_with_timethese): Also use y$i, as in &bench_with_gbenchmark. (&generate_grammar_calc): Don't add a prefix, let the callers do it.	2020-03-22 15:59:13 +01:00
Akim Demaille	4cfb067d93	bench: use a C++-11 compiler See https://github.com/google/benchmark#a-faster-keeprunning-loop. * etc/bench.pl.in ($cxx): Be C++11. (&bench_with_gbenchmark): Adjust.	2020-03-22 15:59:13 +01:00
Akim Demaille	cf60d0a617	bench: create a README file with benches * etc/bench.pl.in (&bench_with_gbenchmark): Here.	2020-03-22 15:59:13 +01:00
Akim Demaille	c0e8489605	bench: calc: add support for google benchmark * etc/bench.pl.in (&compiler): New, extracted from... (&compile): here. Don't link when using gbm. (&calc_input): Don't make massive input for micro benchmarks. (&generate_grammar_calc): When using gbm, use api.prefix to avoid name collisions. Be ready to issue BENCHMARKS instead of a main. (&bench): Rename as... (&bench_with_timethese): this. (&bench_with_gbenchmark): New. (&bench): New. Dispatch on these two.	2020-03-21 18:19:14 +01:00
Akim Demaille	788b1a6858	bench: better error messages on invalid input * etc/bench.pl.in: here.	2020-03-21 18:17:09 +01:00
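
Several of the commits above (3ba001baac, 9434571f95, f3c18c8e80) revolve around separating external token numbers from internal symbol numbers. The following is a minimal sketch of that idea, not the code Bison generates: every name in it (TOK_*, SYM_*, symbol_t, translate, count_completable) and every numeric value is hypothetical and chosen only for illustration. It keeps the translation table in a plain integral type and casts on lookup, and it compares against enumerators directly instead of translating a token on each test.

```c
/* External token numbers, as a scanner might return them.
   Hypothetical values chosen for this sketch.  */
enum { TOK_EOF = 0, TOK_NUM = 258, TOK_VAR = 259, TOK_FUN = 260 };

/* Internal symbol numbers: a dense enum, plus a dedicated "empty"
   value.  The names only imitate the YYSYMBOL_ convention.  */
typedef enum
{
  SYM_EMPTY = -2,
  SYM_EOF = 0,
  SYM_ERROR = 1,
  SYM_UNDEF = 2,
  SYM_NUM = 3,
  SYM_VAR = 4,
  SYM_FUN = 5
} symbol_t;

/* The table is stored with a plain integral type; the conversion to
   the enum type happens in one place, in translate().  */
static const signed char translate_table[] = { SYM_NUM, SYM_VAR, SYM_FUN };

/* Map an external token number to an internal symbol number.  */
static symbol_t
translate (int tok)
{
  if (tok == TOK_EOF)
    return SYM_EOF;
  if (TOK_NUM <= tok && tok <= TOK_FUN)
    return (symbol_t) translate_table[tok - TOK_NUM];
  return SYM_UNDEF;
}

/* Count the expected symbols that are variables or functions.
   Because symbol numbers are compile-time constants, a switch
   works and no translation is needed at each comparison.  */
static int
count_completable (const symbol_t *expected, int n)
{
  int res = 0;
  for (int i = 0; i < n; ++i)
    switch (expected[i])
      {
      case SYM_VAR:
      case SYM_FUN:
        ++res;
        break;
      default:
        break;
      }
  return res;
}
```

In this sketch a caller holding a list of expected symbols would call count_completable (expected, n) directly; translate () is only needed at the boundary where a scanner's token number enters the parser.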

7076 Commits