
7138 Commits

Author SHA1 Message Date
Akim Demaille
f9c73eec5f readme: more about the coding style 2020-04-04 19:20:29 +02:00
Akim Demaille
f6dcecb287 java: fixes in SymbolType
Reported by Paolo Bonzini.
https://github.com/akimd/bison/pull/34#issuecomment-609029634

* data/skeletons/java.m4 (SymbolType): Use 'final' where possible.
(get): Rewrite on top of an array instead of a switch.
2020-04-04 17:11:02 +02:00
Akim Demaille
ad31c3cdf4 java: use SymbolType
The Java enums are very different from the C model.  As a consequence,
one cannot "build" an enum directly from an integer, we must retrieve
it.  That's the purpose of the SymbolType.get class method.

* data/skeletons/java.m4 (b4_symbol_enum, b4_case_code_symbol)
(b4_declare_symbol_enum): New.
* data/skeletons/lalr1.java: Use SymbolType,
SymbolType.YYSYMBOL_YYEMPTY, etc.
* examples/java/calc/Calc.y, tests/local.at: Adjust.
2020-04-04 16:42:33 +02:00
Akim Demaille
7fa23136ca examples: java: use explicit token identifiers
* examples/java/calc/Calc.y: Declare all the tokens, so that we are
compatible with api.token.raw.
* examples/java/calc/Calc.test: Adjust.
2020-04-04 16:42:33 +02:00
Akim Demaille
961ea2ac85 news: announce that Bison 3.6 drops YYERROR_VERBOSE
* NEWS: here.
2020-04-04 14:52:58 +02:00
Akim Demaille
cc6e5cf854 news: update for 3.5.4 2020-04-04 10:56:47 +02:00
Akim Demaille
1376a7c6e2 style: fix spellos
* src/complain.c, src/print.c, src/print-xml.c, src/symtab.h: here.
2020-04-04 10:56:47 +02:00
Adrian Vogelsgesang
1c273826d4 typo: succesful -> successful
* tests/calc.at: Here.
2020-04-04 10:56:47 +02:00
Akim Demaille
4a55a5ea9a package: improve the readme
* README: Describe what Bison is.
2020-04-04 10:56:46 +02:00
Akim Demaille
72f04ca80f java: check and fix support for api.token.raw
* tests/local.at (AT_LANG_MATCH, AT_YYERROR_DECLARE(java))
(AT_YYERROR_DECLARE_EXTERN(java), AT_PARSER_CLASS): New.
(AT_MAIN_DEFINE(java)): Use AT_PARSER_CLASS.
* tests/scanner.at: Add a test for Java.
* data/skeletons/lalr1.java (yytranslate_): Cast the result.
2020-04-04 10:34:53 +02:00
Akim Demaille
fd98afaf10 d: use the SymbolType enum for symbol kinds
* data/skeletons/d.m4 (b4_symbol_enum, b4_declare_symbol_enum): New.
* data/skeletons/lalr1.d: Use them.
Use SymbolType, SymbolType.YYSYMBOL_YYEMPTY etc. where appropriate.
(undef_token_, token_number_type, yy_error_token_): Remove.
2020-04-04 10:31:50 +02:00
Akim Demaille
cca8c73431 java: style: prefer 'int[] foo' to 'int foo[]'
* data/skeletons/java.m4 (b4_typed_parser_table_define): Here.
2020-04-04 08:08:07 +02:00
Akim Demaille
cb40f5c624 build: fix syntax-check issues
* src/system.h, tests/local.mk: Fix indentation.
2020-04-04 08:04:11 +02:00
Akim Demaille
6c23b012b9 tests: recheck: work properly when the test suite was interrupted
* tests/local.mk (recheck): Look at the per-test logs, not the overall
log, which, when interrupted, contains only information about... the
tests that passed.
2020-04-02 07:32:48 +02:00
Akim Demaille
ef88dfba81 doc: c++: promote api.token.raw
* doc/bison.texi (Calc++ Parser): Here.
2020-04-02 07:32:01 +02:00
Akim Demaille
6e89bc0fd2 build: fix compatibility with old compilers
GCC 4.2 dies with

    src/InadequacyList.c: In function 'InadequacyList__new_conflict':
    src/InadequacyList.c:37: error: #pragma GCC diagnostic not allowed inside functions
    src/InadequacyList.c:37: error: #pragma GCC diagnostic not allowed inside functions
    src/InadequacyList.c:40: error: #pragma GCC diagnostic not allowed inside functions

Reported by Evan Lavelle.
See https://lists.gnu.org/r/bug-bison/2020-03/msg00021.html
and https://trac.macports.org/ticket/59927.

* src/system.h (GCC_VERSION): New.
Use it to control IGNORE_TYPE_LIMITS_BEGIN and
IGNORE_TYPE_LIMITS_END.
2020-04-02 07:16:44 +02:00
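
The version-gating idea in 6e89bc0fd2 can be sketched as follows. This is only an illustration, not the contents of src/system.h: the way GCC_VERSION is encoded, the 4.6 threshold, and the helper next_id are assumptions made for the example. Only the macro names GCC_VERSION, IGNORE_TYPE_LIMITS_BEGIN and IGNORE_TYPE_LIMITS_END come from the commit message.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: encode the compiler version as MAJOR * 100 + MINOR,
   e.g. 402 for GCC 4.2.  Compilers that do not define these macros
   get 0.  */
#if defined __GNUC__ && defined __GNUC_MINOR__
# define GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
#else
# define GCC_VERSION 0
#endif

/* Assumption: treat 4.6 as the first release that accepts diagnostic
   pragmas inside a function body.  Older compilers get empty macros,
   so they never see a pragma where they would reject it.  */
#if 406 <= GCC_VERSION
# define IGNORE_TYPE_LIMITS_BEGIN \
  _Pragma ("GCC diagnostic push") \
  _Pragma ("GCC diagnostic ignored \"-Wtype-limits\"")
# define IGNORE_TYPE_LIMITS_END \
  _Pragma ("GCC diagnostic pop")
#else
# define IGNORE_TYPE_LIMITS_BEGIN
# define IGNORE_TYPE_LIMITS_END
#endif

/* Hypothetical helper: hand out consecutive ids.  The assertion is
   always true for an unsigned char, which is exactly the kind of
   comparison -Wtype-limits complains about.  */
static unsigned char
next_id (unsigned char *count)
{
  unsigned char id = (*count)++;
  IGNORE_TYPE_LIMITS_BEGIN
  assert (id <= UCHAR_MAX);
  IGNORE_TYPE_LIMITS_END
  return id;
}
```

With a compiler that reports an older version, both macros expand to nothing and the function compiles without any pragma.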
Akim Demaille
e3e21cc0d8 examples: reccalc: compile cleanly in C99
See https://trac.macports.org/ticket/59927.

* examples/c/reccalc/parse.y: C99 does not allow multiple typedefs.
2020-04-02 07:14:19 +02:00
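
One common way to avoid repeating a typedef, which C99 rejects, is a guard macro. The sketch below is an assumption about the general technique, not a copy of the change made to examples/c/reccalc/parse.y. The guard name is modeled on the one used by Flex-generated scanners, and scanner_is_set is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* First declaration, e.g. coming from the parser's header.  */
#ifndef YY_TYPEDEF_YY_SCANNER_T
# define YY_TYPEDEF_YY_SCANNER_T
typedef void *yyscan_t;
#endif

/* Second declaration, e.g. coming from the scanner's header.  The
   guard is already defined, so the typedef is not repeated and a
   strict C99 compiler has nothing to complain about.  */
#ifndef YY_TYPEDEF_YY_SCANNER_T
# define YY_TYPEDEF_YY_SCANNER_T
typedef void *yyscan_t;
#endif

/* Hypothetical helper, only here to show the type being used.  */
static int
scanner_is_set (yyscan_t scanner)
{
  return scanner != NULL;
}
```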
Akim Demaille
a0ee2a7543 c++: replace symbol_number_type with symbol_type_type
* data/skeletons/c++.m4, data/skeletons/glr.cc,
* data/skeletons/lalr1.cc: here.
2020-04-01 08:32:58 +02:00
Akim Demaille
7e28dbea11 c++: also use symbol_type_type
Because of the insane current implementation of glr.cc, things are a
bit nasty.  We will rename symbol_number_type as symbol_type_type
later, to keep this commit small.

* data/skeletons/c++.m4 (b4_declare_symbol_enum): New.
Also define YYNTOKENS to avoid type clashes when yyntokens_ was
actually defined in another enum.
Use it.
(symbol_number_type): Be an alias of symbol_type_type.
Use YYSYMBOL_YYEMPTY and the like.
Use symbol_number_type where appropriate.
(empty_symbol): Remove.
(yytranslate_): Use symbol_number_type, not token_number_type.
* data/skeletons/lalr1.cc: Use symbol_number_type where appropriate.
Adjust to the replacement of empty_symbol by YYSYMBOL_YYEMPTY.
(yy_error_token_, yy_undef_token_, yyeof_, yyntokens_): Remove.
Adjust dependencies.

* data/skeletons/glr.cc: Use symbol_number_type where appropriate.
Forward definitions of YYSYMBOL_YYEMPTY, etc. to glr.c.

* tests/headers.at: Accept YYNTOKENS and other YYSYMBOL_*.
* tests/local.at (AT_YYERROR_DEFINE(c++)): Use symbol_number_type.
2020-04-01 08:32:50 +02:00
Akim Demaille
65df8d6747 glr.c: remove the yySymbol alias
* data/skeletons/glr.c: Use yysymbol_type_t only.
2020-04-01 08:31:48 +02:00
Akim Demaille
beea39b2ec regen 2020-04-01 08:31:48 +02:00
Akim Demaille
086506bf23 glr.c, yacc.c: propagate yysymbol_type_t
Now that yacc.c and glr.c both know yysymbol_type_t, convert the
common routines.

* data/skeletons/c.m4 (yydestruct, yy_symbol_value_print)
(yy_symbol_print): Use yysymbol_type_t instead of int.
* data/skeletons/glr.c: Use yySymbol where appropriate.
* data/skeletons/yacc.c (YY_ACCESSING_SYMBOL): New wrapper around
yystos.
Use it.
* tests/local.at (yyreport_syntax_error): Use yysymbol_type_t where
appropriate.
2020-04-01 08:31:48 +02:00
Akim Demaille
39792f57fb glr.c: use yysymbol_type_t, YYSYMBOL_YYEOF etc.
Apply the same changes as in yacc.c.  Now yySymbol and yysymbol_type_t
are aliases.  We will remove the former later, to avoid cluttering
this commit.

* data/skeletons/glr.c: Use b4_declare_symbol_enum.
Use YYSYMBOL_YYEOF etc. where appropriate.
(YYUNDEFTOK, YYTERROR): Remove.
(YYTRANSLATE, yySymbol, yyexpected_tokens, yysyntax_error_arguments):
Adjust.
(yy_accessing_symbol): New.
Use it where appropriate.
2020-04-01 08:31:48 +02:00
Akim Demaille
65bbaf9598 regen 2020-04-01 08:31:48 +02:00
Akim Demaille
9039c571f4 yacc.c: fix more errors from make maintainer-check-g++
* data/skeletons/yacc.c (yyexpected_tokens): Use casts where needed.
2020-04-01 08:31:48 +02:00
Akim Demaille
d3db22d788 regen 2020-04-01 08:31:48 +02:00
Akim Demaille
9434571f95 yacc.c: revert to not using yysymbol_type_t in the yytranslate table
This triggers warnings with several compilers.  For instance ICC fills
the logs with pages and pages of

    input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}"
             0,     2,     2,     2,     2,     2,     2,     2,     2,     2,
             ^

    input.c(477): error: a value of type "int" cannot be used to initialize an entity of type "const yysymbol_type_t={yysymbol_type_t}"
             0,     2,     2,     2,     2,     2,     2,     2,     2,     2,
                    ^

And so does G++9 when compiling yacc.c's (C) output

    input.c:545:8: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive]
      545 |        0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
          |        ^
          |        |
          |        int
    input.c:545:15: error: invalid conversion from 'int' to 'yysymbol_type_t' [-fpermissive]
      545 |        0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
          |               ^
          |               |
          |               int

Clang++ is no exception

    input.c:545:8: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int'
           0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
           ^
    input.c:545:15: error: cannot initialize an array element of type 'const yysymbol_type_t' with an rvalue of type 'int'
           0,     5,     9,     2,     2,     2,     2,     2,     2,     2,
                  ^

At some point we could use yysymbol_type_t's enumerators to define
yytranslate.  Meanwhile...

* data/skeletons/yacc.c (yytranslate): Use the original integral type
to define it.
(YYTRANSLATE): Cast the result into yysymbol_type_t.
2020-04-01 08:31:48 +02:00
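
The approach described in 9434571f95 (keep the table integral, cast on lookup) can be sketched as below. The grammar is a toy one: the token codes, the table contents and YYSYMBOL_NUM/YYSYMBOL_PLUS are made up for the example, while yysymbol_type_t, yytranslate, YYTRANSLATE, YYMAXUTOK and YYSYMBOL_YYUNDEF are names quoted in the commit messages.

```c
#include <assert.h>

/* Internal symbol numbers (toy grammar).  */
typedef enum
{
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_YYERROR = 1,
  YYSYMBOL_YYUNDEF = 2,
  YYSYMBOL_NUM = 3,   /* hypothetical */
  YYSYMBOL_PLUS = 4   /* hypothetical */
} yysymbol_type_t;

/* External token codes, as a scanner would return them (made up).  */
enum { TOK_EOF = 0, TOK_PLUS = 1, TOK_NUM = 2, TOK_BOGUS = 3 };
#define YYMAXUTOK 3

/* The table holds plain small integers, so initializing it from
   integer literals is fine for both C and C++ compilers.  */
static const signed char yytranslate[] =
{
  0, /* TOK_EOF   -> YYSYMBOL_YYEOF   */
  4, /* TOK_PLUS  -> YYSYMBOL_PLUS    */
  3, /* TOK_NUM   -> YYSYMBOL_NUM     */
  2  /* TOK_BOGUS -> YYSYMBOL_YYUNDEF */
};

/* The conversion to the enum happens once, at lookup time.  Both arms
   of the conditional have the enum type.  */
#define YYTRANSLATE(YYX)                    \
  (0 <= (YYX) && (YYX) <= YYMAXUTOK         \
   ? (yysymbol_type_t) yytranslate[YYX]     \
   : YYSYMBOL_YYUNDEF)
```

Out-of-range token codes fall back to YYSYMBOL_YYUNDEF instead of indexing past the table.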
Akim Demaille
0cdbcee0ce regen 2020-04-01 08:31:48 +02:00
Akim Demaille
fd37eb057e yysymbol_type_t: always assign an enumerator
Currently we define enumerators only for symbols that have an
identifier.  That rules out tokens such as '+', and nonterminals such
as foo-bar and foo.bar.  As a consequence we are taking chances: the
compiler might compile yysymbol_type_t as too small an integral type
for some symbol codes.

* data/skeletons/bison.m4 (b4_symbol_sid): Forge a unique symbol
identifier for symbols that don't have an ID.
2020-04-01 08:31:48 +02:00
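
To illustrate the concern in fd37eb057e: in C, the integer type behind an enum only has to be able to represent the enumerators that are declared, so leaving the highest symbols without an enumerator could let the compiler pick too small a type. The sketch below gives every symbol of a toy grammar an enumerator. The forged names YYSYMBOL_4_ and YYSYMBOL_6_foo_bar are hypothetical and need not match what b4_symbol_sid actually generates. The toy_ names are also inventions for the example.

```c
#include <assert.h>
#include <string.h>

/* One enumerator per symbol, including those without a usable
   identifier ('+' and foo-bar).  */
typedef enum
{
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_YYERROR = 1,
  YYSYMBOL_YYUNDEF = 2,
  YYSYMBOL_NUM = 3,
  YYSYMBOL_4_ = 4,         /* '+'      (forged name, hypothetical) */
  YYSYMBOL_exp = 5,
  YYSYMBOL_6_foo_bar = 6   /* foo-bar  (forged name, hypothetical) */
} yysymbol_type_t;

enum { TOY_NSYMBOLS = 7 };

static const char *const toy_symbol_names[TOY_NSYMBOLS] =
{
  "end of file", "error", "invalid token", "NUM", "'+'", "exp", "foo-bar"
};

/* Since every symbol number is a declared enumerator, the enum type
   is guaranteed to be wide enough to hold any valid index here.  */
static const char *
toy_symbol_name (yysymbol_type_t sym)
{
  assert (0 <= (int) sym && (int) sym < TOY_NSYMBOLS);
  return toy_symbol_names[sym];
}
```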
Akim Demaille
ecc3a13c34 bistromathic: use symbol numbers instead of YYTRANSLATE
* examples/c/bistromathic/parse.y: here.
2020-04-01 08:31:48 +02:00
Akim Demaille
04904e4d28 regen 2020-04-01 08:31:48 +02:00
Akim Demaille
75a605454d yacc.c: prefer YYSYMBOL_YYERROR to YYSYMBOL_error
* data/skeletons/bison.m4 (b4_symbol_sid): Map "error" to YYSYMBOL_YYERROR.
* data/skeletons/yacc.c: Adjust.
2020-04-01 08:31:48 +02:00
Akim Demaille
d7f39ac507 regen 2020-04-01 08:31:48 +02:00
Akim Demaille
f3c18c8e80 yacc.c: also define a symbol number for the empty token
This is not only cleaner, it also protects us from mixing signed
values (YYEMPTY is #defined as -2) with unsigned types (the
yysymbol_type_t enum is typically compiled as a small unsigned).
For instance GCC 9:

    input.c: In function 'yyparse':
    input.c:1107:7: error: conversion to 'unsigned int' from 'int'
                           may change the sign of the result
                           [-Werror=sign-conversion]
     1107 |   yyn += yytoken;
          |       ^~
    input.c:1107:10: error: conversion to 'int' from 'unsigned int'
                            may change the sign of the result
                            [-Werror=sign-conversion]
     1107 |   yyn += yytoken;
          |          ^~~~~~~
    input.c:1108:47: error: comparison of integer expressions of
                            different signedness:
                            'yytype_int8' {aka 'const signed char'} and
                            'yysymbol_type_t' {aka 'enum yysymbol_type_t'}
                            [-Werror=sign-compare]
     1108 |   if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken)
          |                                               ^~
    input.c:702:25: error: operand of ?: changes signedness from 'int'
                           to 'unsigned int' due to unsignedness of
                           other operand [-Werror=sign-compare]
      702 | #define YYEMPTY         (-2)
          |                         ^~~~
    input.c:1220:33: note: in expansion of macro 'YYEMPTY'
     1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
          |                                 ^~~~~~~
    input.c:1220:41: error: unsigned conversion from 'int' to
                            'unsigned int' changes value
                            from '-2' to '4294967294'
                            [-Werror=sign-conversion]
     1220 |   yytoken = yychar == YYEMPTY ? YYEMPTY : YYTRANSLATE (yychar);
          |                                         ^

Eventually, it might be interesting to move away from -2 (which is the
only possible negative symbol number) and use the next available
number, to save bits.  We could actually even simply use "0" and shift
the rest, which would allow us to write "!yytoken" to mean really
"yytoken != YYEMPTY".

* data/skeletons/c.m4 (b4_declare_symbol_enum): Define YYSYMBOL_YYEMPTY.
* data/skeletons/yacc.c: Use it.

* src/parse-gram.y (yyreport_syntax_error): Use YYSYMBOL_YYEMPTY, not
YYEMPTY, when dealing with a symbol.

* tests/regression.at: Adjust.
2020-04-01 08:31:48 +02:00
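
A minimal sketch of the idea in f3c18c8e80, assuming a toy translation function: once the enum itself has an enumerator for "no lookahead", code that handles symbols never has to mix the external YYEMPTY macro with enum values. The names toy_translate, lookahead_symbol and has_lookahead are hypothetical. YYEMPTY, its value -2, YYEOF and the YYSYMBOL_* names are taken from the commit messages.

```c
#include <assert.h>

/* External token codes.  */
#define YYEMPTY (-2)
#define YYEOF 0

/* Internal symbol numbers.  Declaring the negative enumerator makes
   the enum's underlying type signed.  */
typedef enum
{
  YYSYMBOL_YYEMPTY = -2,
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_YYERROR = 1,
  YYSYMBOL_YYUNDEF = 2
} yysymbol_type_t;

/* Toy stand-in for YYTRANSLATE: only the end-of-file token is known.  */
static yysymbol_type_t
toy_translate (int yychar)
{
  return yychar == YYEOF ? YYSYMBOL_YYEOF : YYSYMBOL_YYUNDEF;
}

/* Both arms of the conditional have the enum type, so no signed value
   is converted to an unsigned one.  */
static yysymbol_type_t
lookahead_symbol (int yychar)
{
  return yychar == YYEMPTY ? YYSYMBOL_YYEMPTY : toy_translate (yychar);
}

static int
has_lookahead (yysymbol_type_t yytoken)
{
  return yytoken != YYSYMBOL_YYEMPTY;
}
```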
Akim Demaille
00c80bc96c yacc.c: use yysymbol_type_t instead of int for yytoken
Now that we have a proper type for internal symbol numbers, let's use
it.  More code needs conversion, e.g., printers and destructors, but
they are shared with glr.c, which is not ready yet for this change.

It will also help us deal with warnings such as (GCC9 on GNU/Linux):

    input.c: In function 'int yyparse()':
    input.c:475:37: error: enumeral and non-enumeral type in conditional expression [-Werror=extra]
      475 |   (0 <= (YYX) && (YYX) <= YYMAXUTOK ? yytranslate[YYX] : YYSYMBOL_YYUNDEF)
          |    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    input.c:1024:17: note: in expansion of macro 'YYTRANSLATE'
     1024 |       yytoken = YYTRANSLATE (yychar);
          |                 ^~~~~~~~~~~

* data/skeletons/yacc.c (yytranslate, yysymbol_name)
(yyparse_context_t, yyexpected_tokens, yypstate_expected_tokens)
(yysyntax_error_arguments):
Use yysymbol_type_t instead of int.
2020-04-01 08:31:48 +02:00
Akim Demaille
f62f1db298 regen 2020-04-01 08:31:48 +02:00
Akim Demaille
3ba001baac yacc.c: introduce an enum that defines the symbol's number
There are a number of advantages in exposing the symbol (internal)
numbers:

- custom error messages can use them to decide how to represent a
  given symbol, or a set of symbols.

- we need something similar in uses of yyexpected_tokens.  For
  instance, currently, bistromathic's completion() reads:

    int ntokens = expected_tokens (line, tokens, YYNTOKENS);
    [...]
    for (int i = 0; i < ntokens; ++i)
      if (tokens[i] == YYTRANSLATE (TOK_VAR))
      [...]
      else if (tokens[i] == YYTRANSLATE (TOK_FUN))
      [...]
      else
      [...]

- now that it's a compile-time expression, we can easily build static
  tables, switch, etc.

- some users depended on the ability to get the token number from a
  symbol to write test cases for their scanners.  But Bison 3.5
  removed the table this feature depended upon (a reverse
  yytranslate).  Now they can check against the actual symbol number,
  without having to pay (space and time) for a conversion.
  See https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html, and
  https://lists.gnu.org/archive/html/bug-bison/2020-03/msg00015.html.

- it helps us clearly separate the internal symbol numbers from the
  external token numbers, whose difference is sometimes blurred in the
  code when values coincide (e.g. "yychar = yytoken = YYEOF").

- it allows us to get rid of ugly macros with inconsistent names such
  as YYUNDEFTOK and YYTERROR, and to group related definitions
  together.

- similarly it provides a clean access to the $accept symbol (which
  proves convenient in a current experimentation of mine with several
  %start symbols).

Let's declare this type as a private type (in the *.c file, not
the *.h one).  So it does not need to be influenced by the api prefix.

* data/skeletons/bison.m4 (b4_symbol_sid): New.
(b4_symbol): Use it.
* data/skeletons/c.m4 (b4_symbol_enum, b4_declare_symbol_enum): New.
* data/skeletons/yacc.c: Use b4_declare_symbol_enum.
(YYUNDEFTOK, YYTERROR): Remove.
Use the corresponding symbol enum instead.
2020-04-01 08:31:33 +02:00
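
As an illustration of the points made in 3ba001baac about completion and compile-time expressions, a list of expected symbols can be dispatched with a plain switch once symbol numbers are enumerators. This is a sketch under assumptions: YYSYMBOL_VAR, YYSYMBOL_FUN and YYSYMBOL_NUM are hypothetical enumerator names for the TOK_VAR and TOK_FUN tokens mentioned in the message, and toy_completion, completion_for and count_completable are inventions for the example.

```c
#include <assert.h>

typedef enum
{
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_YYERROR = 1,
  YYSYMBOL_YYUNDEF = 2,
  YYSYMBOL_VAR = 3,   /* hypothetical */
  YYSYMBOL_FUN = 4,   /* hypothetical */
  YYSYMBOL_NUM = 5    /* hypothetical */
} yysymbol_type_t;

typedef enum
{
  COMPLETE_NOTHING,
  COMPLETE_VARIABLES,
  COMPLETE_FUNCTIONS
} toy_completion;

/* Enumerators are constant expressions, so they can be case labels.
   A YYTRANSLATE (TOK_VAR) table lookup could not.  */
static toy_completion
completion_for (yysymbol_type_t sym)
{
  switch (sym)
    {
    case YYSYMBOL_VAR: return COMPLETE_VARIABLES;
    case YYSYMBOL_FUN: return COMPLETE_FUNCTIONS;
    default:           return COMPLETE_NOTHING;
    }
}

/* Count how many of the expected symbols offer something to complete.  */
static int
count_completable (const yysymbol_type_t *tokens, int ntokens)
{
  int res = 0;
  for (int i = 0; i < ntokens; ++i)
    if (completion_for (tokens[i]) != COMPLETE_NOTHING)
      ++res;
  return res;
}
```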
Akim Demaille
4140320a0a style: comment changes about token numbers
* data/skeletons/bison.m4, data/skeletons/c.m4: here.
2020-03-30 08:41:12 +02:00
Akim Demaille
af19fd7e0f tests: recheck: work properly when the test suite was interrupted
* tests/local.mk (recheck): Look at the per-test logs, not the overall
log, which, when interrupted, contains only information about... the
tests that passed.
2020-03-30 08:41:12 +02:00
Akim Demaille
2c74872991 java: move away from _ for internationalization
The "_" is becoming a keyword in Java, which causes tons of warnings
currently in our test suite.  GNU Gettext is now using "i18n" instead
of "_"
(https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=e89fea36545f27487d9652a13e6a0adbea1117d0).

* data/skeletons/java.m4: Use "i18n", not "_".
* examples/java/calc/Calc.y, tests/calc.at: Adjust.
2020-03-30 08:03:10 +02:00
Akim Demaille
50517d578c regen 2020-03-28 15:13:27 +01:00
Akim Demaille
59d820d1ef c: use YYNOMEM instead of -2
See 84b1972c96.

* data/skeletons/glr.c, data/skeletons/yacc.c (YYNOMEM): New.
Use it.
2020-03-28 15:13:27 +01:00
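
The point of 59d820d1ef is to give the magic return value -2 a name. How YYNOMEM is defined and where the skeletons use it is not shown in this log, so the following is only a sketch of the pattern: the #define form and the helper toy_message_size are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Named status code instead of a bare -2 (the form of the definition
   is an assumption).  */
#define YYNOMEM (-2)

/* Hypothetical helper: compute the buffer size needed to concatenate
   N pieces of the given lengths, plus the terminating null byte.
   Return 0 on success, YYNOMEM if the size does not fit in size_t.  */
static int
toy_message_size (const size_t *lengths, int n, size_t *size)
{
  size_t total = 1; /* terminating null byte */
  for (int i = 0; i < n; ++i)
    {
      if (SIZE_MAX - total < lengths[i])
        return YYNOMEM;
      total += lengths[i];
    }
  *size = total;
  return 0;
}
```

A caller can then compare the result against YYNOMEM rather than against an unexplained literal.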
Akim Demaille
90f0500ef8 todo: update
* TODO (Token Number): We have to clean this.
(Naming conventions, Symbol numbers): New.
(Bad styling): Addressed in e21ff47f5d.
2020-03-28 15:13:27 +01:00
Akim Demaille
17a9542c4f regen 2020-03-28 15:13:27 +01:00
Akim Demaille
b7045aa706 java: make yysyntaxErrorArguments a private detail
* data/skeletons/lalr1.java (yysyntaxErrorArguments): Move it from the
context, to the parser object.
Generate only for detailed and verbose error messages.
* tests/local.at (AT_YYERROR_DEFINE(java)): Use yyexpectedTokens
instead.
2020-03-28 15:13:27 +01:00
Akim Demaille
ee56b6e0f2 skeletons: make yysyntax_error_arguments a private detail
We could just "inline yysyntax_error_arguments back" in the routines
it was originally extracted from, but I think the code is nicer to
read this way.

* data/skeletons/glr.c (yysyntax_error_arguments): Generate only for
detailed and verbose error messages.
* data/skeletons/yacc.c: Likewise.
* data/skeletons/lalr1.cc (parser::context::yysyntax_error_arguments):
Move as...
(parser::yysyntax_error_arguments_): this.
And only for detailed and verbose error messages.
2020-03-28 15:13:27 +01:00
Akim Demaille
1edc98f793 lalr1.cc: avoid using yysyntax_error_arguments
* data/skeletons/lalr1.cc (context::token): New.
* tests/local.at (yyreport_syntax_error): Don't use
yysyntax_error_arguments.
2020-03-28 15:13:27 +01:00
Akim Demaille
4192de1f41 bison: avoid using yysyntax_error_arguments
* src/parse-gram.y (yyreport_syntax_error): Use yyparse_context_token
and yyexpected_tokens.
2020-03-28 15:13:27 +01:00
Akim Demaille
00b0d02955 tests: yacc.c: avoid yysyntax_error_arguments
Because glr.c shares the same testing routines, we also need to
convert it.

* data/skeletons/glr.c (yyparse_context_token): New.
* tests/local.at (yyreport_syntax_error): here.
2020-03-28 15:13:27 +01:00
Akim Demaille
1045c8d0ef examples: don't use yysyntax_error_arguments
Suggested by Adrian Vogelsgesang.
https://lists.gnu.org/archive/html/bison-patches/2020-02/msg00069.html

* data/skeletons/lalr1.java (Context.EMPTY, Context.getToken): New.
(Context.yyntokens): Rename as...
(Context.NTOKENS): this.
Because (i) all the Java coding styles recommend upper case for
constants, and (ii) the Java Skeleton exposes Lexer.EOF, not
Lexer.YYEOF.
* data/skeletons/yacc.c (yyparse_context_token): New.
* examples/c/bistromathic/parse.y (yyreport_syntax_error): Don't use
yysyntax_error_arguments.
* examples/java/calc/Calc.y (yyreportSyntaxError): Likewise.
2020-03-28 15:13:27 +01:00
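
The series above moves custom error reporting away from yysyntax_error_arguments and toward two separate queries: which token was unexpected, and which tokens were expected. The sketch below shows that shape with toy stand-ins. The real yyparse_context_token and yyexpected_tokens are named in these commits, but their exact signatures are not shown in this log, so every type and function here (the toy_ names and append) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_EMPTY = -2 };  /* "no unexpected token" marker */

static const char *const toy_names[] =
{
  "end of file", "number", "'+'", "')'"
};

/* Toy stand-in for the parser's error context.  */
typedef struct
{
  int unexpected;       /* offending symbol, or TOY_EMPTY */
  const int *expected;  /* symbols that would have been accepted */
  int nexpected;
} toy_context;

/* Stand-in for the "unexpected token" query.  */
static int
toy_context_token (const toy_context *ctx)
{
  return ctx->unexpected;
}

/* Stand-in for the "expected tokens" query: store at most MAX symbols
   in OUT and return how many were stored.  */
static int
toy_expected_tokens (const toy_context *ctx, int *out, int max)
{
  int n = ctx->nexpected < max ? ctx->nexpected : max;
  for (int i = 0; i < n; ++i)
    out[i] = ctx->expected[i];
  return n;
}

/* Append TEXT to the null-terminated BUF without overflowing SIZE.  */
static void
append (char *buf, size_t size, const char *text)
{
  size_t used = strlen (buf);
  strncat (buf, text, size - used - 1);
}

/* Build the message from the two queries.  */
static void
toy_report_syntax_error (const toy_context *ctx, char *buf, size_t size)
{
  enum { MAX_EXPECTED = 4 };
  int expected[MAX_EXPECTED];
  int n = toy_expected_tokens (ctx, expected, MAX_EXPECTED);
  int unexpected = toy_context_token (ctx);
  assert (0 < size);
  buf[0] = '\0';
  append (buf, size, "syntax error");
  if (unexpected != TOY_EMPTY)
    {
      append (buf, size, ", unexpected ");
      append (buf, size, toy_names[unexpected]);
    }
  for (int i = 0; i < n; ++i)
    {
      append (buf, size, i == 0 ? ", expecting " : " or ");
      append (buf, size, toy_names[expected[i]]);
    }
}
```

Keeping the two queries separate lets the caller decide on wording, ordering and truncation, which a pre-assembled argument array does not allow as easily.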