bison

mirror of https://git.savannah.gnu.org/git/bison.git synced 2026-07-24 11:50:33 +00:00

Author	SHA1	Message	Date
Akim Demaille	04904e4d28	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	d7f39ac507	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	f3c18c8e80	yacc.c: also define a symbol number for the empty token This is not only cleaner, it also protects us from mixing signed values (YYEMPTY is #defined as -2) with unsigned types (the yysymbol_type_t enum is typically compiled as a small unsigned). For instance GCC 9: input.c: In function 'yyparse': input.c:1107:7: error: conversion to 'unsigned int' from 'int' may change the sign of the result [-Werror=sign-conversion] 1107 \| yyn += yytoken; \| ^~ input.c:1107:10: error: conversion to 'int' from 'unsigned int' may change the sign of the result [-Werror=sign-conversion] 1107 \| yyn += yytoken; \| ^~~~~~~ input.c:1108:47: error: comparison of integer expressions of different signedness: 'yytype_int8' {aka 'const signed char'} and 'yysymbol_type_t' {aka 'enum yysymbol_type_t'} [-Werror=sign-compare] 1108 \| if (yyn < 0 \|\| YYLAST < yyn \|\| yycheck[yyn] != yytoken) \| ^~ input.c:702:25: error: operand of ?: changes signedness from 'int' to 'unsigned int' due to unsignedness of other operand [-Werror=sign-compare] 702 \| #define YYEMPTY (-2) \| ^~~~ input.c:1220:33: note: in expansion of macro 'YYEMPTY' 1220 \| yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar); \| ^~~~~~~ input.c:1220:41: error: unsigned conversion from 'int' to 'unsigned int' changes value from '-2' to '4294967294' [-Werror=sign-conversion] 1220 \| yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar); \| ^ Eventually, it might be interesting to move away from -2 (which is the only possible negative symbol number) and use the next available number, to save bits. We could actually even simply use "0" and shift the rest, which would allow to write "!yytoken" to mean really "yytoken != YYEMPTY". * data/skeletons/c.m4 (b4_declare_symbol_enum): Define YYSYMBOL_YYEMPTY. * data/skeletons/yacc.c: Use it. * src/parse-gram.y (yyreport_syntax_error): Use YYSYMBOL_YYEMPTY, not YYEMPTY, when dealing with a symbol. * tests/regression.at: Adjust.	2020-04-01 08:31:48 +02:00
Akim Demaille	00c80bc96c	yacc.c: use yysymbol_type_t instead of int for yytoken Now that we have a proper type for internal symbol numbers, let's use it. More code needs conversion, e.g., printers and destructors, but they are shared with glr.c, which is not ready yet for this change. It will also help us deal with warnings such as (GCC9 on GNU/Linux): input.c: In function 'int yyparse()': input.c:475:37: error: enumeral and non-enumeral type in conditional expression [-Werror=extra] 475 \| (0 <= (YYX) && (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYSYMBOL_YYUNDEF) \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ input.c:1024:17: note: in expansion of macro 'YYTRANSLATE' 1024 \| yytoken = YYTRANSLATE (yychar); \| ^~~~~~~~~~~ * data/skeletons/yacc.c (yytranslate, yysymbol_name) (yyparse_context_t, yyexpected_tokens, yypstate_expected_tokens) (yysyntax_error_arguments): Use yysymbol_type_t instead of int.	2020-04-01 08:31:48 +02:00
Akim Demaille	f62f1db298	regen	2020-04-01 08:31:48 +02:00
Akim Demaille	50517d578c	regen	2020-03-28 15:13:27 +01:00
Akim Demaille	17a9542c4f	regen	2020-03-28 15:13:27 +01:00
Akim Demaille	4192de1f41	bison: avoid using yysyntax_error_arguments * src/parse-gram.y (yyreport_syntax_error): Use yyparse_context_token and yyexpected_tokens.	2020-03-28 15:13:27 +01:00
Akim Demaille	84b1972c96	yacc.c: use negative numbers for errors in auxiliary functions yyparse returns 0, 1, 2 since ages (accept, reject, memory exhausted). Some of our auxiliary functions such as yy_lac and yyreport_syntax_error also need to return error codes and also use 0, 1, 2. Because it uses yy_lac, yyexpected_tokens also needs to return "problem", "memory exhausted", but in case of success, it needs to return the number of tokens, so it cannot use 1 and 2 as error code. Currently it uses -1 and -2, which is later converted into 1 and 2 as yacc.c expects it. Let's simplify this and use consistently -1 and -2 for auxiliary functions that are not exposed (or not yet exposed) to the user. In particular this will save the user from having to convert yyexpected_tokens's -2 into yyreport_syntax_error's 2: both return -1 or -2. * data/skeletons/yacc.c (yy_lac, yyreport_syntax_error) (yy_lac_stack_realloc): Return -1, -2 for errors instead of 1, 2. Adjust callers. * examples/c/bistromathic/parse.y (yyreport_syntax_error): Do take error codes into account. Issue a syntax error message even if we ran out of memory. * src/parse-gram.y, tests/local.at (yyreport_syntax_error): Adjust.	2020-03-23 07:02:36 +01:00
Akim Demaille	1079595b2a	style: reduce length of private constant * data/skeletons/glr.c, data/skeletons/lalr1.cc, data/skeletons/yacc.c (YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as... (YYARGS_MAX): this. * src/parse-gram.y (YYERROR_VERBOSE_ARGS_MAXIMUM): Rename as... (ARGS_MAX): this.	2020-03-23 07:02:34 +01:00
Akim Demaille	466fb66578	regen	2020-03-17 19:21:24 +01:00
Akim Demaille	951da960e6	merge branch 'maint' * upstream/maint: maint: post-release administrivia version 3.5.3 news: update for 3.5.3 yacc.c: make sure we properly propagated the user's number for error diagnostics: don't crash because of repeated definitions of error style: initialize some struct members diagnostics: beware of zero-width characters diagnostics: be sure to close the styling when lines are too short muscles: fix incorrect decoding of $ code: be robust to reference with invalid tags build: fix typo doc: update recommendation for libtextstyle style: comment changes examples: use consistently the GFDL header for readmes style: remove useless declarations typo: succesful -> successful README: point to tests/bison, and document --trace gnulib: update maint: post-release administrivia	2020-03-08 10:13:16 +01:00
Akim Demaille	cfcd823e16	diagnostics: don't crash because of repeated definitions of error According to https://www.unix.com/man-page/POSIX/1posix/yacc/, the user is allowed to specify her user number for the error token: The token error shall be reserved for error handling. The name error can be used in grammar rules. It indicates places where the parser can recover from a syntax error. The default value of error shall be 256. Its value can be changed using a %token declaration. The lexical analyzer should not return the value of error. I think this feature is useless, the user should not have to deal with that. The intent is probably to give the user a means to use 256 if she wants to, but provided "error" cleared the path first by being assigned another number. In the case of Bison, 256 is assigned to "error" at the end if the user did not use it for a token of hers. So this feature is useless. Yet it is valid, and if the user assigns twice a token number to "error", then the second time we want to complain about it and want to show the original definition. At this point, we try to display the built-in definition of "error", whose location is NULL, and we crash. Rather, the location of the first user definition of "error" should become its defining location. Reported by Ahcheong Lee. https://lists.gnu.org/r/bug-bison/2020-03/msg00007.html * src/symtab.c (symbol_class_set): If this is a declaration and the symbol was not declared yet, keep this as defining location. * tests/input.at (Redefining the error token): New.	2020-03-08 08:10:11 +01:00
Akim Demaille	2f02d9beae	style: initialize some struct members * src/symtab.c (sym_content_new): Initialize all the location members. Not needed by the code, but disturbing values when using a debugger.	2020-03-08 08:10:11 +01:00
Akim Demaille	b638603477	diagnostics: beware of zero-width characters Currently we rely on (visual) width of the characters to decide where to open and close the styling of the quoted lines. This breaks when we deal with zero-width characters: we cannot just rely on (visual) columns, we need to know whether we are before, inside, or after the highlighted portion. * src/location.c (location_caret): col_end: no longer add 1, "regular" characters have a width of 1, only 0-width characters have 0-width. opened: replace with 'state', a three-valued enum. Don't reopen the style if we already did. * tests/diagnostics.at (Zero-width characters): New.	2020-03-08 08:10:11 +01:00
Akim Demaille	e21ff47f5d	diagnostics: be sure to close the styling when lines are too short bar.y:4.12-17: <error>error:</error> redefining user token number of foo - 4 \| %token foo <error>123 + 4 \| %token foo <error>123</error> \| <error>^~~~~~</error> * src/location.c (location_caret): Be sure to close. * tests/diagnostics.at (Line is too short, and then you die): New.	2020-03-07 10:01:52 +01:00
Akim Demaille	b82b387da9	muscles: fix incorrect decoding of $ Bug introduced in `458171e6df`. https://lists.gnu.org/archive/html/bison-patches/2013-11/msg00009.html Reported by Ahcheong Lee. https://lists.gnu.org/r/bug-bison/2020-03/msg00010.html * src/muscle-tab.c (COMMON_DECODE): "$" is coded as "$][", not "$[][". * tests/input.at ("%define" enum variables): Check that case.	2020-03-07 07:45:10 +01:00
Akim Demaille	641e326303	code: be robust to reference with invalid tags Because we want to support $<a->b>$, we must accept -> in type tags, and reject $<->$, as it is unfinished. Reported by Ahcheong Lee. * src/scan-code.l (yylex): Make sure "tag" does not end with -, since -> does not close the tag. * tests/input.at (Stray $ or @): Check this.	2020-03-06 17:29:26 +01:00
Akim Demaille	666df338a7	style: comment changes * src/symtab.h, src/lr0.c: here.	2020-03-06 08:32:03 +01:00
Akim Demaille	b493c173c9	style: remove useless declarations * src/reader.h: Don't duplicate what parse-gram.h already exposes. * src/lr0.h: Remove useless include.	2020-03-06 08:30:21 +01:00
Adrian Vogelsgesang and Akim Demaille	aab3feb5a1	typo: succesful -> successful * data/skeletons/lalr1.cc: here * etc/bench.pl.in: here * src/location.c: and here.	2020-03-06 08:29:58 +01:00
Akim Demaille	30d01b21e7	c++: minor fixes Address compiler warnings such as warning: declaration of 'yyla' shadows a member of 'yy::parser::context' [-Wshadow] * data/skeletons/lalr1.cc (context): Don't use the same names for variables and members. Use foo_ for private members, as in parser. Also, use the + trick in array accesses to please ICC and provide it with an int.	2020-02-27 21:31:09 +01:00
Adrian Vogelsgesang and Akim Demaille	368fcf0af5	typo: succesful -> successful * data/skeletons/lalr1.cc: here * etc/bench.pl.in: here * src/location.c: here * tests/calc.at: and here	2020-02-27 18:10:39 +01:00
Akim Demaille	296660304c	style: comment changes * src/symtab.h, src/lr0.c: here.	2020-02-23 08:25:53 +01:00
Akim Demaille	323731ff74	style: avoid using 'this' as an identifier LLDB insists on parsing 'this' as a C++ keyword, even when debugging a C program. * src/symtab.c: Please the dictator.	2020-02-23 08:25:53 +01:00
Akim Demaille	6f7f949708	style: remove useless declarations * src/reader.h: Don't duplicate what parse-gram.h already exposes. * src/lr0.h: Remove useless include.	2020-02-23 08:25:53 +01:00
Akim Demaille	7a28659495	regen	2020-02-15 08:28:57 +01:00
Victor Morales Cayuela and Akim Demaille	e09a72eeb0	diagnostics: modernize the display of submessages Since Bison 2.7, output was indented four spaces for explanatory statements. For example: input.y:2.7-13: error: %type redeclaration for exp input.y:1.7-11: previous declaration Since the introduction of caret-diagnostics, it became less clear. Remove the indentation and display submessages as in GCC: input.y:2.7-13: error: %type redeclaration for exp 2 \| %type <float> exp \| ^~~~~~~ input.y:1.7-11: note: previous declaration 1 \| %type <int> exp \| ^~~~~ * src/complain.h (SUB_INDENT): Remove. (warnings): Add "note" to the enum. * src/complain.h, src/complain.c (complain_indent): Replace by... (subcomplain): this. Adjust all dependencies. * tests/actions.at, tests/diagnostics.at, tests/glr-regression.at, * tests/input.at, tests/named-refs.at, tests/regression.at: Adjust expectations.	2020-02-15 08:28:40 +01:00
Akim Demaille	cc3760ef51	news: 3.5.2 * NEWS: Update.	2020-02-13 18:25:11 +01:00
Akim Demaille	8637f2c7d6	build: pacify syntax-check * src/complain.c: Fix indentation. * cfg.mk: Using strcmp is ok in the tests. Test cases and examples don't need Bison's PO support.	2020-02-10 20:46:56 +01:00
Akim Demaille	6946149701	regen	2020-02-10 20:42:23 +01:00
Akim Demaille	18a7cfc7cf	java: make the syntax error format string translatable The error format should be translated, but contrary to the case of C/C++, we cannot just depend on macros to adapt on the presence/absence of '_'. Let's consider that the message format is to be translated iff there are some internationalized tokens. * src/output.c (prepare_symbol_names): Define b4_has_translations. * data/skeletons/java.m4 (b4_trans): New. * data/skeletons/lalr1.java: Use it to emit translatable or not the format string.	2020-02-08 11:24:53 +01:00
Akim Demaille	52db24b2bc	java: add support for parse.error=detailed In Java there is no need for N_ and yytranslate_. So instead of hard-coding the use of N_ in the table of the symbol names, rely on b4_symbol_translate. * src/output.c (prepare_symbol_names): Use b4_symbol_translate instead of N_. * data/skeletons/c.m4 (b4_symbol_translate): New. * data/skeletons/lalr1.java (yysymbolName): New. Use it. * examples/java/calc/Calc.y: Use parse.error=detailed. * tests/calc.at: Check parse.error=detailed.	2020-02-08 11:24:53 +01:00
Akim Demaille	e6b0612f91	bison: pretend to 3.6 already * src/parse-gram.y: here.	2020-01-26 13:29:18 +01:00
Akim Demaille	81849520cd	regen	2020-01-23 08:30:51 +01:00
Akim Demaille	fc2191f137	diagnostics: modernize bison's syntax errors We used to display the unexpected token first: $ bison foo.y foo.y:1.8-13: error: syntax error, unexpected %token, expecting character literal or identifier or <tag> 1 \| %token %token \| ^~~~~~ GCC uses a different format: $ gcc-mp-9 foo.c foo.c:1:5: error: expected identifier or '(' before ')' token 1 \| int()()() \| ^ and so does Clang: $ clang-mp-9.0 foo.c foo.c:1:5: error: expected identifier or '(' int()()() ^ 1 error generated. They display the unexpected token last (or not at all). Also, they don't waste width with "syntax error". Let's try that. It gives, for the same example as above: $ bison foo.y foo.y:1.8-13: error: expected character literal or identifier or <tag> before %token 1 \| %token %token \| ^~~~~~ * src/complain.h, src/complain.c (syntax_error): New. * src/parse-gram.y (yyreport_syntax_error): Use it.	2020-01-23 08:30:28 +01:00
Akim Demaille	6fb362c87a	regen	2020-01-23 08:26:33 +01:00
Akim Demaille	46ab1d0cbe	diagnostics: report syntax errors in color * src/parse-gram.y (parse.error): Set to 'custom'. (yyreport_syntax_error): New. * data/bison-default.css (.expected, .unexpected): New. * tests/diagnostics.at: Adjust.	2020-01-23 08:26:33 +01:00
Akim Demaille	f54a5b303b	regen	2020-01-23 08:26:33 +01:00
Akim Demaille	2cc361387c	diagnostics: translate bison's own tokens As a test case, support translations in Bison itself. * src/parse-gram.y: Mark the translatable tokens. While at it, use clearer names. * tests/input.at: Adjust expectations.	2020-01-23 08:26:28 +01:00
Akim Demaille	e6d1289f4a	diagnostics: handle -fno-caret in the called functions Don't force callers of location_caret to have to deal with flags that disable it. * src/location.h, src/location.c (location_caret) (location_caret_suggestion): Early return if disabled. * src/complain.c: Simplify.	2020-01-22 22:31:41 +01:00
Akim Demaille	6ada985ff3	parsers: issue tname with i18n markup Some users would like to avoid having to "parse" the .y file to find the strings to translate. Let's issue the translatable tokens with N_ to allow "parsing" the generated parsers instead. See https://lists.gnu.org/archive/html/bison-patches/2019-01/msg00015.html src/output.c (prepare_symbol_names): Issue symbol_names with N_() markup.	2020-01-19 21:23:11 +01:00
Akim Demaille	1db962716a	regen	2020-01-19 21:23:11 +01:00
Akim Demaille	9096955fba	parsers: support translatable token aliases In addition to %token NUM "number" accept %token NUM _("number") in which case the token will be translated in error messages. Do not use _() in the output if there are no translatable tokens. * src/symtab.h, src/symtab.c (symbol): Add a 'translatable' member. * src/parse-gram.y (TSTRING): New token. (string_as_id.opt): Replace with... (alias): this. Use it. * src/scan-gram.l (SC_ESCAPED_TSTRING): New start conditions, to match TSTRINGs. * src/output.c (prepare_symbols): Define b4_translatable if there are translatable strings. * data/skeletons/glr.c, data/skeletons/lalr1.cc, * data/skeletons/yacc.c (yytnamerr): Receive b4_translatable, and use it.	2020-01-19 21:23:11 +01:00
Akim Demaille	d9df62bfcd	yacc.c: escape trigraphs in detailed parse.error * src/output.c (escape_trigraphs, xescape_trigraphs): New. (prepare_symbol_names): Use it. * tests/regression.at: Check the handling of trigraphs with parse.error = detailed.	2020-01-19 21:22:41 +01:00
Akim Demaille	adac9a17f0	regen	2020-01-19 14:51:14 +01:00
Akim Demaille	3b4b157369	bison: use detailed error messages * #: .	2020-01-19 14:51:14 +01:00
Akim Demaille	5c332d70d7	regen	2020-01-19 14:51:14 +01:00
Akim Demaille	f443673450	yacc.c: add support for parse.error detailed "detailed" error messages are almost like "verbose", except that we don't double escape them, they don't get inner quotes, we don't use yytnamerr, and we hide the table. "custom" is exposed with the "detailed" tokens, not the "verbose" ones: they are not double-quoted. Because there's a risk that some people use yytname even without "verbose", let's keep yytname (instead of yys_name) in "simple" parse.error. * src/output.c (prepare_symbol_names): Be ready to output symbol names unquoted. (prepare_symbol_names): Output both the old tname table, and the new symbol_names one. * data/skeletons/bison.m4: Accept 'detailed'. * data/skeletons/yacc.c: When parse.error is 'detailed', don't emit yytname and yytnamerr, just yysymbol_name with the table inside. * tests/calc.at: Adjust.	2020-01-19 14:51:14 +01:00
Akim Demaille	8e6233353f	c: use yysymbol_name in traces Only parse.error verbose and simple will get the original yytname: the other options will rely on a different table. So let's move on top of the yysymbol_name function. * data/skeletons/c.m4 (yy_symbol_print): Use yysymbol_name. * data/skeletons/glr.c (yytokenName): Rename as... (yysymbol_name): this. The change of naming scheme is unfortunate, but it's definitely glr.c which is "wrong".	2020-01-19 14:51:14 +01:00
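
Commit 84b1972c96 above describes a convention in which non-exposed auxiliary functions return -1 ("problem") or -2 ("memory exhausted") on failure, so that a function which returns a count on success can share the same codes as its callers. The following is only a minimal sketch of that convention, not Bison's skeleton code: the names AUX_PROBLEM, AUX_NOMEM, expected_tokens and report_syntax_error, as well as the fixed buffer size, are hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical error codes mirroring the -1/-2 convention from the
   commit message; these names are illustrative, not Bison's. */
enum { AUX_PROBLEM = -1, AUX_NOMEM = -2 };

/* Hypothetical helper: copy the candidate tokens into OUT.
   Returns the number of tokens (>= 0) on success, AUX_PROBLEM on
   invalid arguments, AUX_NOMEM if OUT is too small. */
int
expected_tokens (const int *candidates, size_t ncandidates,
                 int *out, size_t capacity)
{
  if (!candidates || !out)
    return AUX_PROBLEM;
  if (capacity < ncandidates)
    return AUX_NOMEM;
  for (size_t i = 0; i < ncandidates; ++i)
    out[i] = candidates[i];
  return (int) ncandidates;
}

/* Hypothetical caller: because both functions use negative error
   codes, a failure is forwarded unchanged instead of being
   converted from -2 to 2. Returns 0 on success. */
int
report_syntax_error (const int *candidates, size_t ncandidates)
{
  int buf[4];
  int n = expected_tokens (candidates, ncandidates, buf, 4);
  if (n < 0)
    return n;
  /* A real reporter would format a message from buf[0..n-1] here. */
  return 0;
}
```

In this sketch, a caller only needs to test `n < 0` to tell an error from a token count; a non-negative return value is always a valid count.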
