Merge branch maint

* maint:
  maint: post-release administrivia
  version 3.3.2
  style: minor fixes
  NEWS: named constructors are preferable to symbol_type ctors
  gram: fix handling of nterms in actions when some are unused
  style: rename local variable
  CI: update the ICC serial number for travis-ci.org
Akim Demaille
2019-02-03 15:23:54 +01:00
10 changed files with 198 additions and 52 deletions


@@ -1 +1 @@
3.3.1
3.3.2

30
NEWS

@@ -9,6 +9,13 @@ GNU Bison NEWS
When given -fsyntax-only, the diagnostics are reported, but no output is
generated.
* Noteworthy changes in release 3.3.2 (2019-02-03) [stable]
** Bug fixes
Bison 3.3 failed to generate parsers for grammars with unused nonterminal
symbols.
* Noteworthy changes in release 3.3.1 (2019-01-27) [stable]
** Changes
@@ -225,17 +232,18 @@ GNU Bison NEWS
symbol_type (int token, const int&);
symbol_type (int token);
which should be used in a Flex-scanner as follows.
%%
[a-z]+ return yy::parser::symbol_type (ID, yytext);
[0-9]+ return yy::parser::symbol_type (INT, text_to_int (yytext));
":" return yy::parser::symbol_type (':');
<<EOF>> return yy::parser::symbol_type (0);
Correct matching between token types and value types is checked via
'assert'. For instance, 'symbol_type (ID, 42)' would abort (while
'make_ID (42)' would not even compile).
'assert'; for instance, 'symbol_type (ID, 42)' would abort. Named
constructors are preferable, as they offer better type safety (for
instance 'make_ID (42)' would not even compile), but symbol_type
constructors may help when token types are discovered at run-time, e.g.,
[a-z]+ {
if (auto i = lookup_keyword (yytext))
return yy::parser::symbol_type (i);
else
return yy::parser::make_ID (yytext);
}
*** C++: Variadic emplace
@@ -3488,7 +3496,7 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
LocalWords: Heimbigner AST src ast Makefile srcdir MinGW xxlex XXSTYPE
LocalWords: XXLTYPE strictfp IDEs ffixit fdiagnostics parseable fixits
LocalWords: Wdeprecated yytext Variadic variadic yyrhs yyphrs RCS README
LocalWords: noexcept constexpr ispell american
LocalWords: noexcept constexpr ispell american deprecations
Local Variables:
ispell-dictionary: "american"

1
THANKS

@@ -18,6 +18,7 @@ Antonio Silva Correia amsilvacorreia@hotmail.com
Arnold Robbins arnold@skeeve.com
Art Haas ahaas@neosoft.com
Askar Safin safinaskar@mail.ru
Balázs Scheidler balazs.scheidler@oneidentity.com
Baron Schwartz baron@sequent.org
Ben Pfaff blp@cs.stanford.edu
Benoit Perrot benoit.perrot@epita.fr


@@ -75,48 +75,75 @@ skeletons.
## Symbols
### `b4_symbol(NUM, FIELD)`
In order to unify the handling of the various aspects of symbols (tag, type
name, whether terminal, etc.), bison.exe defines one macro per (token,
field), where field can be `has_id`, `id`, etc.: see
src/output.c:prepare_symbols_definitions().
`prepare_symbol_definitions()` in `src/output.c`.
The various FIELDS are:
The macro `b4_symbol(NUM, FIELD)` gives access to the following FIELDS:
- `has_id`: 0 or 1.
- has_id: 0 or 1.
Whether the symbol has an id.
- id: string
If has_id, the id. Guaranteed to be usable as a C identifier.
Prefixed by api.token.prefix if defined.
- tag: string.
- `id`: string
If has_id, the id (prefixed by api.token.prefix if defined), otherwise
defined as empty. Guaranteed to be usable as a C identifier.
- `tag`: string.
A representation of the symbol. Can be 'foo', 'foo.id', '"foo"' etc.
- user_number: integer
- `user_number`: integer
The external number as used by yylex. Can be ASCII code when a character,
some number chosen by bison, or some user number in the case of
%token FOO <NUM>. Corresponds to yychar in yacc.c.
- is_token: 0 or 1
- `is_token`: 0 or 1
Whether this is a terminal symbol.
- number: integer
- `number`: integer
The internal number (computed from the external number by yytranslate).
Corresponds to yytoken in yacc.c. This is the same number that serves as
key in b4_symbol(NUM, FIELD).
- has_type: 0, 1
In bison, symbols are first assigned increasing numbers in order of
appearance (but tokens first, then nterms). After grammar reduction,
unused nterms are then renumbered to appear last (i.e., first tokens, then
used nterms and finally unused nterms). This final number NUM is the one
contained in this field, and it is the one used as key in `b4_symbol(NUM,
FIELD)`.
The code of the rule actions, however, is emitted before we know what
symbols are unused, so they use the original numbers. To avoid confusion,
they actually use "orig NUM" instead of just "NUM". bison also emits
definitions for `b4_symbol(orig NUM, number)` that map from original
numbers to the new ones. `b4_symbol` actually resolves `orig NUM` in the
other case, i.e., `b4_symbol(orig 42, tag)` would return the tag of the
symbol whose original number was 42.
- `has_type`: 0, 1
Whether the symbol has a semantic value.
- type_tag: string
- `type_tag`: string
When api.value.type=union, the generated name for the union member.
yytype_INT etc. for symbols that has_id, otherwise yytype_1 etc.
- type
- `type`
If it has a semantic value, its type tag, or, if variants are used,
its type.
In the case of api.value.type=union, type is the real type (e.g. int).
- has_printer: 0, 1
- printer: string
- printer_file: string
- printer_line: integer
- `has_printer`: 0, 1
- `printer`: string
- `printer_file`: string
- `printer_line`: integer
If the symbol has a printer, everything about it.
- has_destructor, destructor, destructor_file, destructor_line
- `has_destructor`, `destructor`, `destructor_file`, `destructor_line`
Likewise.
### b4_symbol_value(VAL, [SYMBOL-NUM], [TYPE-TAG])
### `b4_symbol_value(VAL, [SYMBOL-NUM], [TYPE-TAG])`
Expansion of $$, $1, $<TYPE-TAG>3, etc.
The semantic value from a given VAL.
@@ -127,14 +154,14 @@ The semantic value from a given VAL.
The result can be used safely, it is put in parens to avoid nasty precedence
issues.
### b4_lhs_value(SYMBOL-NUM, [TYPE])
### `b4_lhs_value(SYMBOL-NUM, [TYPE])`
Expansion of `$$` or `$<TYPE>$`, for symbol `SYMBOL-NUM`.
### b4_rhs_data(RULE-LENGTH, POS)
### `b4_rhs_data(RULE-LENGTH, POS)`
The data corresponding to the symbol `#POS`, where the current rule has
`RULE-LENGTH` symbols on RHS.
### b4_rhs_value(RULE-LENGTH, POS, SYMBOL-NUM, [TYPE])
### `b4_rhs_value(RULE-LENGTH, POS, SYMBOL-NUM, [TYPE])`
Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols
on RHS.
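
The renumbering described for the `number` field above (tokens first, then used nterms, then unused nterms) can be sketched in C as follows. This is only an illustration under stated assumptions, not Bison's code: `renumber_nterms`, the `useful` array and its layout are hypothetical names introduced here. The actual implementation is `nonterminals_reduce()` in `src/reduce.c`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch.  Nonterminals are originally numbered
   ntokens .. ntokens + nvars - 1.  useful[i] tells whether
   nonterminal ntokens + i survives grammar reduction.  On return,
   map[i] is the new number of nonterminal ntokens + i.  */
static void
renumber_nterms (int ntokens, int nvars, const bool *useful, int *map)
{
  int n = ntokens;
  /* Useful nonterminals come first, in their original order.  */
  for (int i = 0; i < nvars; ++i)
    if (useful[i])
      map[i] = n++;
  /* Useless nonterminals are pushed to the end.  */
  for (int i = 0; i < nvars; ++i)
    if (!useful[i])
      map[i] = n++;
  /* Every nonterminal received exactly one new number.  */
  assert (n == ntokens + nvars);
}
```

For instance, with 3 tokens and nonterminals originally numbered 3 to 7, where only number 6 is useless, the map leaves 3, 4 and 5 unchanged, sends 7 to 6, and sends 6 to 7. An action that was emitted with `orig 7` therefore ends up referring to symbol 6.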


@@ -389,17 +389,28 @@ m4_define([b4_glr_cc_if],
#
# The following macros provide access to symbol related values.
# _b4_symbol(NUM, FIELD)
# ----------------------
# __b4_symbol(NUM, FIELD)
# -----------------------
# Recover a FIELD about symbol #NUM. Thanks to m4_indir, fails if
# undefined.
m4_define([_b4_symbol],
m4_define([__b4_symbol],
[m4_indir([b4_symbol($1, $2)])])
# _b4_symbol(NUM, FIELD)
# ----------------------
# Recover a FIELD about symbol #NUM (or "orig NUM"). Fails if
# undefined.
m4_define([_b4_symbol],
[m4_ifdef([b4_symbol($1, number)],
[__b4_symbol(m4_indir([b4_symbol($1, number)]), $2)],
[__b4_symbol([$1], [$2])])])
# b4_symbol(NUM, FIELD)
# ---------------------
# Recover a FIELD about symbol #NUM. Thanks to m4_indir, fails if
# Recover a FIELD about symbol #NUM (or "orig NUM"). Fails if
# undefined. If FIELD = id, prepend the token prefix.
m4_define([b4_symbol],
[m4_case([$2],


@@ -38,6 +38,7 @@
#include "muscle-tab.h"
#include "output.h"
#include "reader.h"
#include "reduce.h"
#include "scan-code.h" /* max_left_semantic_context */
#include "scan-skel.h"
#include "symtab.h"
@@ -413,6 +414,14 @@ merger_output (FILE *out)
static void
prepare_symbol_definitions (void)
{
/* Map "orig NUM" to new numbers. See data/README. */
for (symbol_number i = ntokens; i < nsyms + nuseless_nonterminals; ++i)
{
obstack_printf (&format_obstack, "symbol(orig %d, number)", i);
const char *key = obstack_finish0 (&format_obstack);
MUSCLE_INSERT_INT (key, nterm_map ? nterm_map[i - ntokens] : i);
}
for (int i = 0; i < nsyms; ++i)
{
symbol *sym = symbols[i];
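
The loop added to `prepare_symbol_definitions` emits one `symbol(orig NUM, number)` entry per original nterm number. A minimal stand-alone sketch of the two pieces involved, assuming plain `snprintf` in place of Bison's obstack and muscle machinery; `orig_to_new` and `format_orig_key` are hypothetical helpers, not functions from the Bison sources:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: the new number for original number ORIG.
   When no map was built, the number is returned unchanged, mirroring
   the `nterm_map ? nterm_map[i - ntokens] : i` expression in the
   patch.  */
static int
orig_to_new (const int *map, int ntokens, int orig)
{
  return map ? map[orig - ntokens] : orig;
}

/* Hypothetical helper: write the key used for original number ORIG
   into BUF.  Returns what snprintf returns (the key length when the
   buffer is large enough).  */
static int
format_orig_key (char *buf, size_t size, int orig)
{
  return snprintf (buf, size, "symbol(orig %d, number)", orig);
}
```

The key is the text the skeletons look up, and the value is the final symbol number, so `b4_symbol(orig NUM, FIELD)` can be redirected to the renumbered symbol.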


@@ -258,22 +258,23 @@ reduce_grammar_tables (void)
| Remove useless nonterminals. |
`------------------------------*/
symbol_number *nterm_map = NULL;
static void
nonterminals_reduce (void)
{
nterm_map = xnmalloc (nvars, sizeof *nterm_map);
/* Map the nonterminals to their new index: useful first, useless
afterwards. Kept for later report. */
symbol_number *nontermmap = xnmalloc (nvars, sizeof *nontermmap);
{
symbol_number n = ntokens;
for (symbol_number i = ntokens; i < nsyms; ++i)
if (bitset_test (V, i))
nontermmap[i - ntokens] = n++;
nterm_map[i - ntokens] = n++;
for (symbol_number i = ntokens; i < nsyms; ++i)
if (!bitset_test (V, i))
{
nontermmap[i - ntokens] = n++;
nterm_map[i - ntokens] = n++;
if (symbols[i]->content->status != used)
complain (&symbols[i]->location, Wother,
_("nonterminal useless in grammar: %s"),
@@ -281,32 +282,30 @@ nonterminals_reduce (void)
}
}
/* Shuffle elements of tables indexed by symbol number. */
{
symbol **symbols_sorted = xnmalloc (nvars, sizeof *symbols_sorted);
for (symbol_number i = ntokens; i < nsyms; ++i)
symbols[i]->content->number = nontermmap[i - ntokens];
symbols[i]->content->number = nterm_map[i - ntokens];
for (symbol_number i = ntokens; i < nsyms; ++i)
symbols_sorted[nontermmap[i - ntokens] - ntokens] = symbols[i];
symbols_sorted[nterm_map[i - ntokens] - ntokens] = symbols[i];
for (symbol_number i = ntokens; i < nsyms; ++i)
symbols[i] = symbols_sorted[i - ntokens];
free (symbols_sorted);
}
/* Update nonterminal numbers in the RHS of the rules. LHS are
pointers to the symbol structure, they don't need renumbering. */
{
for (rule_number r = 0; r < nrules; ++r)
for (item_number *rhsp = rules[r].rhs; 0 <= *rhsp; ++rhsp)
if (ISVAR (*rhsp))
*rhsp = symbol_number_as_item_number (nontermmap[*rhsp
- ntokens]);
accept->content->number = nontermmap[accept->content->number - ntokens];
*rhsp = symbol_number_as_item_number (nterm_map[*rhsp - ntokens]);
accept->content->number = nterm_map[accept->content->number - ntokens];
}
nsyms -= nuseless_nonterminals;
nvars -= nuseless_nonterminals;
free (nontermmap);
}
@@ -432,4 +431,6 @@ reduce_free (void)
bitset_free (V);
bitset_free (V1);
bitset_free (P);
free (nterm_map);
nterm_map = NULL;
}


@@ -32,6 +32,11 @@ bool reduce_nonterminal_useless_in_grammar (const sym_content *sym);
void reduce_free (void);
/** Map initial nterm numbers to the new ones. Built by
* reduce_grammar. Size nvars + nuseless_nonterminals. */
extern symbol_number *nterm_map;
extern unsigned nuseless_nonterminals;
extern unsigned nuseless_productions;
#endif /* !REDUCE_H_ */


@@ -648,7 +648,7 @@ handle_action_dollar (symbol_list *rule, char *text, location dollar_loc)
untyped_var_seen = true;
}
obstack_printf (&obstack_for_string, "]b4_lhs_value(%d, ",
obstack_printf (&obstack_for_string, "]b4_lhs_value(orig %d, ",
sym->content.sym->content->number);
obstack_quote (&obstack_for_string, type_name);
obstack_sgrow (&obstack_for_string, ")[");
@@ -677,7 +677,9 @@ handle_action_dollar (symbol_list *rule, char *text, location dollar_loc)
"]b4_rhs_value(%d, %d, ",
effective_rule_length, n);
if (sym)
obstack_printf (&obstack_for_string, "%d, ", sym->content.sym->content->number);
obstack_printf (&obstack_for_string, "%s%d, ",
sym->content.sym->content->class == nterm_sym ? "orig " : "",
sym->content.sym->content->number);
else
obstack_sgrow (&obstack_for_string, "[], ");
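
The effect of the `%s%d` change above is that nterm numbers are written with an `orig ` prefix while token numbers stay bare. A small sketch of that formatting rule, assuming `snprintf` instead of `obstack_printf`; `format_symbol_ref` is a hypothetical name used only for this illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the symbol-number argument passed to
   b4_rhs_value.  Nonterminal numbers are tagged "orig " because they
   may still be renumbered; token numbers are written as is.  */
static int
format_symbol_ref (char *buf, size_t size, bool is_nterm, int number)
{
  return snprintf (buf, size, "%s%d, ", is_nterm ? "orig " : "", number);
}
```

In the `$$` case the patch always writes `orig`, since the left-hand side of a rule is necessarily a nonterminal.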


@@ -217,6 +217,88 @@ AT_CLEANUP
## --------------- ##
## Useless Parts. ##
## --------------- ##
AT_SETUP([Useless Parts])
# We used to emit code that used symbol numbers before the useless
# symbol elimination, hence before the renumbering of the useful
# symbols. As a result, the evaluation of the skeleton failed because
# it used nonexistent symbol numbers. Which is the happy scenario:
# we could use numbers of other existing symbols...
# http://lists.gnu.org/archive/html/bug-bison/2019-01/msg00044.html
AT_BISON_OPTION_PUSHDEFS
AT_DATA([[input.y]],
[[%code {
]AT_YYERROR_DECLARE_EXTERN[
]AT_YYLEX_DECLARE_EXTERN[
}
%union { void* ptr; }
%type <ptr> used1
%type <ptr> used2
%%
start
: used1
;
used1
: used2 { $$ = $1; }
;
unused
: used2
;
used2
: { $$ = YY_NULLPTR; }
;
]])
AT_BISON_CHECK([[-fcaret -rall -o input.c input.y]], 0, [],
[[input.y: warning: 1 nonterminal useless in grammar [-Wother]
input.y: warning: 1 rule useless in grammar [-Wother]
input.y:18.1-6: warning: nonterminal useless in grammar: unused [-Wother]
unused
^~~~~~
]])
AT_CHECK([[sed -n '/^State 0/q;/^$/!p' input.output]], 0,
[[Nonterminals useless in grammar
unused
Rules useless in grammar
4 unused: used2
Grammar
0 $accept: start $end
1 start: used1
2 used1: used2
3 used2: %empty
Terminals, with rules where they appear
$end (0) 0
error (256)
Nonterminals, with rules where they appear
$accept (3)
on left: 0
start (4)
on left: 1, on right: 0
used1 <ptr> (5)
on left: 2, on right: 1
used2 <ptr> (6)
on left: 3, on right: 2
]])
# Make sure the generated parser is correct.
AT_COMPILE([input.o])
AT_BISON_OPTION_POPDEFS
AT_CLEANUP
## ------------------- ##
## Reduced Automaton. ##
## ------------------- ##