style: use 'nonterminal' consistently

* doc/bison.texi: Formatting changes.
* src/gram.h, src/gram.c (nvars): Rename as...
(nnterms): this.
Adjust dependencies.
(section): New.  Use it.
Replace "non terminal" and "non-terminal" by "nonterminal".
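
For readers new to the code base, the renamed counter keeps the existing numbering convention: tokens take symbol numbers 0 to ntokens - 1, nonterminals take ntokens to nsyms - 1, and per-nonterminal tables are indexed by `sym - ntokens`. Below is a minimal sketch of that convention. The counts are taken from the first test dump in this commit (ntokens = 7, nnterms = 4, nsyms = 11); `is_nterm` and `nterm_index` are hypothetical helpers for illustration only, not functions from Bison.

```c
#include <stdbool.h>

/* Stand-ins for Bison's globals, using the counts from the test dump.
   In Bison these are plain int variables, not enumerators. */
enum { ntokens = 7, nnterms = 4, nsyms = ntokens + nnterms };

/* Hypothetical helper: true if SYM is a nonterminal's symbol number. */
static bool
is_nterm (int sym)
{
  return ntokens <= sym && sym < nsyms;
}

/* Hypothetical helper: index of nonterminal SYM in a table of NNTERMS
   entries, mirroring the `i - ntokens` indexing used in the sources. */
static int
nterm_index (int sym)
{
  return sym - ntokens;
}
```

With these counts, symbol 7 ($accept in the dump) is the first nonterminal and maps to index 0.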
Akim Demaille
2020-06-27 11:12:48 +02:00
parent 4efb2f7bd2
commit 0895858d8e
22 changed files with 111 additions and 99 deletions
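
The new `section` helper in src/gram.c prints a title, underlines it with as many dashes as the title has characters, and ends with a blank line. Below is a standalone sketch of the same idea for trying it outside Bison. It follows the code in this commit, except that the loop counter is a `size_t` rather than an `int`, which is a choice made here for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Print S on its own line, underline it with dashes of the same
   length, then emit an empty line. */
static void
section (FILE *out, const char *s)
{
  fprintf (out, "%s\n", s);
  for (size_t i = strlen (s); 0 < i; --i)
    putc ('-', out);
  putc ('\n', out);
  putc ('\n', out);
}
```

Calling `section (stdout, "Nonterminals")` writes the title followed by twelve dashes, which matches the updated expected output in the test suite.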


@@ -37,6 +37,10 @@ Only user visible strings are to be translated: error messages, bits of the
assert/abort), and all the --trace output which is meant for the maintainers
only.
## Vocabulary
Use "nonterminal", not "variable" or "non-terminal" or "non terminal".
Abbreviated as "nterm".
## Syntax highlighting
It's quite nice to be in C++ mode when editing lalr1.cc for instance.
However tools such as Emacs will be fooled by the fact that braces and

TODO

@@ -91,7 +91,7 @@ generates tons of white space in the page, and may contribute to bad page
breaks.
** consistency
token vs terminal, variable vs non terminal.
token vs terminal.
** api.token.raw
The YYUNDEFTOK could be assigned a semantic value so that yyerror could be


@@ -2834,13 +2834,13 @@ predefined variables such as @code{pi} or @code{e} as well.
Add some new functions from @file{math.h} to the initialization list.
@item
Add another array that contains constants and their values. Then
modify @code{init_table} to add these constants to the symbol table.
It will be easiest to give the constants type @code{VAR}.
Add another array that contains constants and their values. Then modify
@code{init_table} to add these constants to the symbol table. It will be
easiest to give the constants type @code{VAR}.
@item
Make the program report an error if the user refers to an
uninitialized variable in any way except to store a value in it.
Make the program report an error if the user refers to an uninitialized
variable in any way except to store a value in it.
@end enumerate
@node Grammar File
@@ -5513,12 +5513,12 @@ do @{
yypstate_delete (ps);
@end example
If the user decided to use an impure push parser, a few things about
the generated parser will change. The @code{yychar} variable becomes
a global variable instead of a variable in the @code{yypush_parse} function.
For this reason, the signature of the @code{yypush_parse} function is
changed to remove the token as a parameter. A nonreentrant push parser
example would thus look like this:
If the user decided to use an impure push parser, a few things about the
generated parser will change. The @code{yychar} variable becomes a global
variable instead of a local one in the @code{yypush_parse} function. For
this reason, the signature of the @code{yypush_parse} function is changed to
remove the token as a parameter. A nonreentrant push parser example would
thus look like this:
@example
extern int yychar;
@@ -8104,10 +8104,9 @@ doing so would produce on the stack the sequence of symbols @code{expr
@vindex yychar
@vindex yylval
@vindex yylloc
The lookahead token is stored in the variable @code{yychar}.
Its semantic value and location, if any, are stored in the variables
@code{yylval} and @code{yylloc}.
@xref{Action Features}.
The lookahead token is stored in the variable @code{yychar}. Its semantic
value and location, if any, are stored in the variables @code{yylval} and
@code{yylloc}. @xref{Action Features}.
@node Shift/Reduce
@section Shift/Reduce Conflicts
@@ -14263,14 +14262,13 @@ start:
These tokens prevents the introduction of new conflicts. As far as the
parser goes, that is all that is needed.
Now the difficult part is ensuring that the scanner will send these
tokens first. If your scanner is hand-written, that should be
straightforward. If your scanner is generated by Lex, them there is
simple means to do it: recall that anything between @samp{%@{ ... %@}}
after the first @code{%%} is copied verbatim in the top of the generated
@code{yylex} function. Make sure a variable @code{start_token} is
available in the scanner (e.g., a global variable or using
@code{%lex-param} etc.), and use the following:
Now the difficult part is ensuring that the scanner will send these tokens
first. If your scanner is hand-written, that should be straightforward. If
your scanner is generated by Lex, them there is simple means to do it:
recall that anything between @samp{%@{ ... %@}} after the first @code{%%} is
copied verbatim in the top of the generated @code{yylex} function. Make
sure a variable @code{start_token} is available in the scanner (e.g., a
global variable or using @code{%lex-param} etc.), and use the following:
@example
/* @r{Prologue.} */


@@ -104,21 +104,21 @@ print_fderives (void)
fprintf (stderr, "\n\n");
}
/*------------------------------------------------------------------.
| Set FIRSTS to be an NVARS array of NVARS bitsets indicating which |
| items can represent the beginning of the input corresponding to |
| which other items. |
/*-------------------------------------------------------------------.
| Set FIRSTS to be an NNTERMS array of NNTERMS bitsets indicating |
| which items can represent the beginning of the input corresponding |
| to which other items. |
| |
| For example, if some rule expands symbol 5 into the sequence of |
| symbols 8 3 20, the symbol 8 can be the beginning of the data for |
| symbol 5, so the bit [8 - ntokens] in first[5 - ntokens] (= FIRST |
| (5)) is set. |
`------------------------------------------------------------------*/
`-------------------------------------------------------------------*/
static void
set_firsts (void)
{
firsts = bitsetv_create (nvars, nvars, BITSET_FIXED);
firsts = bitsetv_create (nnterms, nnterms, BITSET_FIXED);
for (symbol_number i = ntokens; i < nsyms; ++i)
for (symbol_number j = 0; derives[i - ntokens][j]; ++j)
@@ -139,8 +139,8 @@ set_firsts (void)
}
/*-------------------------------------------------------------------.
| Set FDERIVES to an NVARS by NRULES matrix of bits indicating which |
| rules can help derive the beginning of the data for each |
| Set FDERIVES to an NNTERMS by NRULES matrix of bits indicating |
| which rules can help derive the beginning of the data for each |
| nonterminal. |
| |
| For example, if symbol 5 can be derived as the sequence of symbols |
@@ -151,7 +151,7 @@ set_firsts (void)
static void
set_fderives (void)
{
fderives = bitsetv_create (nvars, nrules, BITSET_FIXED);
fderives = bitsetv_create (nnterms, nrules, BITSET_FIXED);
set_firsts ();


@@ -177,9 +177,9 @@ si_bfs_free (si_bfs_node *n)
/**
* start is a state_item such that conflict_sym is an element of FIRSTS of the
* non-terminal after the dot in start. Because of this, we should be able to
* nonterminal after the dot in start. Because of this, we should be able to
* find a production item starting with conflict_sym by only searching productions
* of the non-terminal and shifting over nullable non-terminals
* of the nonterminal and shifting over nullable nonterminals
*
* this returns the derivation of the productions that lead to conflict_sym
*/
@@ -292,7 +292,7 @@ complete_diverging_example (symbol_number conflict_sym,
// We go backwards through the path to create the derivation tree bottom-up.
// Effectively this loops through each production once, and generates a
// derivation of the left hand side by appending all of the rhs symbols.
// this becomes the derivation of the non-terminal after the dot in the
// this becomes the derivation of the nonterminal after the dot in the
// next production, and all of the other symbols of the rule are added as normal.
for (gl_list_node_t state_node = list_get_end (path);
state_node != NULL;
@@ -334,8 +334,8 @@ complete_diverging_example (symbol_number conflict_sym,
// Since reductions have the dot at the end of the item,
// this loop will be first executed on the last item in the path
// that's not a reduction. When that happens,
// the symbol after the dot should be a non-terminal,
// and we can look through successive nullable non-terminals
// the symbol after the dot should be a nonterminal,
// and we can look through successive nullable nonterminals
// for one with the conflict symbol in its first set.
if (bitset_test (FIRSTS (sym), conflict_sym))
{


@@ -25,7 +25,7 @@
# include "gram.h"
/* Derivations are trees of symbols such that each non terminal's
/* Derivations are trees of symbols such that each nonterminal's
children are symbols that produce that nonterminal if they are
relevant to the counterexample. The leaves of a derivation form a
counterexample when printed. */


@@ -62,7 +62,7 @@ derives_compute (void)
{
/* DSET[NTERM - NTOKENS] -- A linked list of the numbers of the rules
whose LHS is NTERM. */
rule_list **dset = xcalloc (nvars, sizeof *dset);
rule_list **dset = xcalloc (nnterms, sizeof *dset);
/* DELTS[RULE] -- There are NRULES rule number to attach to nterms.
Instead of performing NRULES allocations for each, have an array
@@ -82,9 +82,9 @@ derives_compute (void)
/* DSET contains what we need under the form of a linked list. Make
it a single array. */
derives = xnmalloc (nvars, sizeof *derives);
derives = xnmalloc (nnterms, sizeof *derives);
/* Q is the storage for DERIVES[...] (DERIVES[0] = q). */
rule **q = xnmalloc (nvars + nrules, sizeof *q);
rule **q = xnmalloc (nnterms + nrules, sizeof *q);
for (symbol_number i = ntokens; i < nsyms; ++i)
{


@@ -40,7 +40,7 @@ rule_number nrules = 0;
symbol **symbols = NULL;
int nsyms = 0;
int ntokens = 1;
int nvars = 0;
int nnterms = 0;
symbol_number *token_translations = NULL;
@@ -192,10 +192,10 @@ grammar_rules_partial_print (FILE *out, const char *title,
if (first)
fprintf (out, "%s\n\n", title);
else if (previous_rule && previous_rule->lhs != rules[r].lhs)
fputc ('\n', out);
putc ('\n', out);
first = false;
rule_print (&rules[r], previous_rule, out);
fputc ('\n', out);
putc ('\n', out);
previous_rule = &rules[r];
}
if (!first)
@@ -241,15 +241,25 @@ grammar_rules_print_xml (FILE *out, int level)
xml_puts (out, level + 1, "<rules/>");
}
static void
section (FILE *out, const char *s)
{
fprintf (out, "%s\n", s);
for (int i = strlen (s); 0 < i; --i)
putc ('-', out);
putc ('\n', out);
putc ('\n', out);
}
void
grammar_dump (FILE *out, const char *title)
{
fprintf (out, "%s\n\n", title);
fprintf (out,
"ntokens = %d, nvars = %d, nsyms = %d, nrules = %d, nritems = %d\n\n",
ntokens, nvars, nsyms, nrules, nritems);
"ntokens = %d, nnterms = %d, nsyms = %d, nrules = %d, nritems = %d\n\n",
ntokens, nnterms, nsyms, nrules, nritems);
fprintf (out, "Tokens\n------\n\n");
section (out, "Tokens");
{
fprintf (out, "Value Sprec Sassoc Tag\n");
@@ -261,7 +271,7 @@ grammar_dump (FILE *out, const char *title)
fprintf (out, "\n\n");
}
fprintf (out, "Non terminals\n-------------\n\n");
section (out, "Nonterminals");
{
fprintf (out, "Value Tag\n");
@@ -271,7 +281,7 @@ grammar_dump (FILE *out, const char *title)
fprintf (out, "\n\n");
}
fprintf (out, "Rules\n-----\n\n");
section (out, "Rules");
{
fprintf (out,
"Num (Prec, Assoc, Useful, UselessChain) Lhs"
@@ -293,17 +303,17 @@ grammar_dump (FILE *out, const char *title)
/* Dumped the RHS. */
for (item_number *rhsp = rule_i->rhs; 0 <= *rhsp; ++rhsp)
fprintf (out, " %3d", *rhsp);
fputc ('\n', out);
putc ('\n', out);
}
}
fprintf (out, "\n\n");
fprintf (out, "Rules interpreted\n-----------------\n\n");
section (out, "Rules interpreted");
for (rule_number r = 0; r < nrules + nuseless_productions; ++r)
{
fprintf (out, "%-5d %s:", r, rules[r].lhs->symbol->tag);
rule_rhs_print (&rules[r], out);
fputc ('\n', out);
putc ('\n', out);
}
fprintf (out, "\n\n");
}


@@ -23,9 +23,9 @@
/* Representation of the grammar rules:
NTOKENS is the number of tokens, and NVARS is the number of
NTOKENS is the number of tokens, and NNTERMS is the number of
variables (nonterminals). NSYMS is the total number, ntokens +
nvars.
nnterms.
Each symbol (either token or variable) receives a symbol number.
Numbers 0 to NTOKENS - 1 are for tokens, and NTOKENS to NSYMS - 1
@@ -113,7 +113,7 @@
extern int nsyms;
extern int ntokens;
extern int nvars;
extern int nnterms;
/* Elements of ritem. */
typedef int item_number;


@@ -99,7 +99,7 @@ void
set_goto_map (void)
{
/* Count the number of gotos (ngotos) per nterm (goto_map). */
goto_map = xcalloc (nvars + 1, sizeof *goto_map);
goto_map = xcalloc (nnterms + 1, sizeof *goto_map);
ngotos = 0;
for (state_number s = 0; s < nstates; ++s)
{
@@ -113,7 +113,7 @@ set_goto_map (void)
}
}
goto_number *temp_map = xnmalloc (nvars + 1, sizeof *temp_map);
goto_number *temp_map = xnmalloc (nnterms + 1, sizeof *temp_map);
{
goto_number k = 0;
for (symbol_number i = ntokens; i < nsyms; ++i)
@@ -583,7 +583,7 @@ lalr_update_state_numbers (state_number old_to_new[], state_number nstates_old)
{
goto_number ngotos_reachable = 0;
symbol_number nonterminal = 0;
aver (nsyms == nvars + ntokens);
aver (nsyms == nnterms + ntokens);
for (goto_number i = 0; i < ngotos; ++i)
{
@@ -601,7 +601,7 @@ lalr_update_state_numbers (state_number old_to_new[], state_number nstates_old)
++ngotos_reachable;
}
}
while (nonterminal <= nvars)
while (nonterminal <= nnterms)
{
aver (ngotos == goto_map[nonterminal]);
goto_map[nonterminal++] = ngotos_reachable;


@@ -86,7 +86,7 @@ state_list_append (symbol_number sym, size_t core_size, item_index *core)
return res;
}
/* Symbols that can be "shifted" (including non terminals) from the
/* Symbols that can be "shifted" (including nonterminals) from the
current state. */
bitset shift_symbol;


@@ -186,7 +186,7 @@ shortest_path_from_start (state_item_number target, symbol_number next_sym)
}
}
// For production steps, follow_L is based on the symbol after the
// non-terminal being produced.
// nonterminal being produced.
// if no such symbol exists, follow_L is unchanged
// if the symbol is a terminal, follow_L only contains that terminal
// if the symbol is not nullable, follow_L is its FIRSTS set


@@ -54,17 +54,17 @@ nullable_print (FILE *out)
void
nullable_compute (void)
{
nullable = xcalloc (nvars, sizeof *nullable);
nullable = xcalloc (nnterms, sizeof *nullable);
size_t *rcount = xcalloc (nrules, sizeof *rcount);
/* RITEM contains all the rules, including useless productions.
Hence we must allocate room for useless nonterminals too. */
rule_list **rsets = xcalloc (nvars, sizeof *rsets);
rule_list **rsets = xcalloc (nnterms, sizeof *rsets);
/* This is said to be more elements than we actually use.
Supposedly NRITEMS - NRULES is enough. But why take the risk? */
rule_list *relts = xnmalloc (nritems + nvars + 1, sizeof *relts);
rule_list *relts = xnmalloc (nritems + nnterms + 1, sizeof *relts);
symbol_number *squeue = xnmalloc (nvars, sizeof *squeue);
symbol_number *squeue = xnmalloc (nnterms, sizeof *squeue);
symbol_number *s2 = squeue;
{
rule_list *p = relts;


@@ -277,7 +277,7 @@ static void
prepare_symbols (void)
{
MUSCLE_INSERT_INT ("tokens_number", ntokens);
MUSCLE_INSERT_INT ("nterms_number", nvars);
MUSCLE_INSERT_INT ("nterms_number", nnterms);
MUSCLE_INSERT_INT ("symbols_number", nsyms);
MUSCLE_INSERT_INT ("code_max", max_code);


@@ -125,13 +125,13 @@ void parse_state_lists (parse_state *ps, gl_list_t *state_items,
* is appended to state-items. */
parse_state_list simulate_transition (parse_state *ps);
/* Look at all of the productions for the non-terminal following the dot in the tail
/* Look at all of the productions for the nonterminal following the dot in the tail
* state-item. Appends to state-items each production state-item which may start with
* compat_sym. */
parse_state_list simulate_production (parse_state *ps, symbol_number compat_sym);
/* Removes the last rule_len state-items along with their derivations. A new state-item is
* appended representing the goto after the reduction. A derivation for the non-terminal that
* appended representing the goto after the reduction. A derivation for the nonterminal that
* was just reduced is appended which consists of the list of derivations that were just removed. */
parse_state_list simulate_reduction (parse_state *ps, int rule_len,
bitset symbol_set);


@@ -818,7 +818,7 @@ check_and_convert_grammar (void)
}
aver (nsyms <= SYMBOL_NUMBER_MAXIMUM);
aver (nsyms == ntokens + nvars);
aver (nsyms == ntokens + nnterms);
/* Assign the symbols their symbol numbers. */
symbols_pack ();


@@ -93,7 +93,7 @@ useless_nonterminals (void)
/* N is set as built. Np is set being built this iteration. P is
set of all productions which have a RHS all in N. */
bitset Np = bitset_create (nvars, BITSET_FIXED);
bitset Np = bitset_create (nnterms, BITSET_FIXED);
/* The set being computed is a set of nonterminals which can derive
the empty string or strings consisting of all terminals. At each
@@ -201,7 +201,7 @@ inaccessable_symbols (void)
int nuseful_nonterminals = 0;
for (symbol_number i = ntokens; i < nsyms; ++i)
nuseful_nonterminals += bitset_test (V, i);
nuseless_nonterminals = nvars - nuseful_nonterminals;
nuseless_nonterminals = nnterms - nuseful_nonterminals;
/* A token that was used in %prec should not be warned about. */
for (rule_number r = 0; r < nrules; ++r)
@@ -263,7 +263,7 @@ symbol_number *nterm_map = NULL;
static void
nonterminals_reduce (void)
{
nterm_map = xnmalloc (nvars, sizeof *nterm_map);
nterm_map = xnmalloc (nnterms, sizeof *nterm_map);
/* Map the nonterminals to their new index: useful first, useless
afterwards. Kept for later report. */
{
@@ -284,7 +284,7 @@ nonterminals_reduce (void)
/* Shuffle elements of tables indexed by symbol number. */
{
symbol **symbols_sorted = xnmalloc (nvars, sizeof *symbols_sorted);
symbol **symbols_sorted = xnmalloc (nnterms, sizeof *symbols_sorted);
for (symbol_number i = ntokens; i < nsyms; ++i)
symbols[i]->content->number = nterm_map[i - ntokens];
for (symbol_number i = ntokens; i < nsyms; ++i)
@@ -305,7 +305,7 @@ nonterminals_reduce (void)
}
nsyms -= nuseless_nonterminals;
nvars -= nuseless_nonterminals;
nnterms -= nuseless_nonterminals;
}
@@ -368,7 +368,7 @@ reduce_grammar (void)
{
/* Allocate the global sets used to compute the reduced grammar */
N = bitset_create (nvars, BITSET_FIXED);
N = bitset_create (nnterms, BITSET_FIXED);
P = bitset_create (nrules, BITSET_FIXED);
V = bitset_create (nsyms, BITSET_FIXED);
V1 = bitset_create (nsyms, BITSET_FIXED);
@@ -401,7 +401,7 @@ reduce_grammar (void)
fprintf (stderr, "reduced %s defines %d terminals, %d nonterminals"
", and %d productions.\n",
grammar_file, ntokens, nvars, nrules);
grammar_file, ntokens, nnterms, nrules);
}
}


@@ -33,7 +33,7 @@ bool reduce_nonterminal_useless_in_grammar (const sym_content *sym);
void reduce_free (void);
/** Map initial nterm numbers to the new ones. Built by
* reduce_grammar. Size nvars + nuseless_nonterminals. */
* reduce_grammar. Size nnterms + nuseless_nonterminals. */
extern symbol_number *nterm_map;
extern int nuseless_nonterminals;


@@ -336,7 +336,7 @@ bitsetv firsts = NULL;
static void
init_firsts (void)
{
firsts = bitsetv_create (nvars, nsyms, BITSET_FIXED);
firsts = bitsetv_create (nnterms, nsyms, BITSET_FIXED);
for (rule_number i = 0; i < nrules; ++i)
{
rule *r = rules + i;


@@ -549,7 +549,7 @@ symbol_class_set (symbol *sym, symbol_class class, location loc, bool declaring)
complain_pct_type_on_token (&sym->location);
if (class == nterm_sym && s->class != nterm_sym)
s->number = nvars++;
s->number = nnterms++;
else if (class == token_sym && s->number == NUMBER_UNDEFINED)
s->number = ntokens++;
s->class = class;
@@ -621,7 +621,7 @@ symbol_check_defined (symbol *sym)
{
complain_symbol_undeclared (sym);
s->class = nterm_sym;
s->number = nvars++;
s->number = nnterms++;
}
if (s->class == token_sym
@@ -852,7 +852,7 @@ symbols_new (void)
/* Construct the accept symbol. */
accept = symbol_get ("$accept", empty_loc);
accept->content->class = nterm_sym;
accept->content->number = nvars++;
accept->content->number = nnterms++;
/* Construct the YYerror/"error" token */
errtoken = symbol_get ("YYerror", empty_loc);
@@ -969,7 +969,7 @@ dummy_symbol_get (location loc)
assure (len < sizeof buf);
symbol *sym = symbol_get (buf, loc);
sym->content->class = nterm_sym;
sym->content->number = nvars++;
sym->content->number = nnterms++;
return sym;
}


@@ -547,7 +547,7 @@ static void
goto_actions (void)
{
size_t *state_count = xnmalloc (nstates, sizeof *state_count);
yydefgoto = xnmalloc (nvars, sizeof *yydefgoto);
yydefgoto = xnmalloc (nnterms, sizeof *yydefgoto);
/* For a given nterm I, STATE_COUNT[S] is the number of times there
is a GOTO to S on I. */
@@ -780,9 +780,9 @@ tables_generate (void)
correlated. In particular the signedness is not taken into
account. But it's not useless. */
verify (sizeof nstates <= sizeof nvectors);
verify (sizeof nvars <= sizeof nvectors);
verify (sizeof nnterms <= sizeof nvectors);
nvectors = state_number_as_int (nstates) + nvars;
nvectors = state_number_as_int (nstates) + nnterms;
froms = xcalloc (nvectors, sizeof *froms);
tos = xcalloc (nvectors, sizeof *tos);


@@ -328,7 +328,7 @@ input.y: warning: 1 rule useless in grammar [-Wother]
input.y:4.1-7: warning: nonterminal useless in grammar: useless [-Wother]
Reduced Grammar
ntokens = 7, nvars = 4, nsyms = 11, nrules = 6, nritems = 17
ntokens = 7, nnterms = 4, nsyms = 11, nrules = 6, nritems = 17
Tokens
------
@@ -343,8 +343,8 @@ Value Sprec Sassoc Tag
6 0 0 "num"
Non terminals
-------------
Nonterminals
------------
Value Tag
7 $accept
@@ -411,7 +411,7 @@ exp:
AT_BISON_CHECK([[--trace=grammar -o input.c input.y]], [], [],
[[Reduced Grammar
ntokens = 10, nvars = 2, nsyms = 12, nrules = 8, nritems = 29
ntokens = 10, nnterms = 2, nsyms = 12, nrules = 8, nritems = 29
Tokens
------
@@ -429,8 +429,8 @@ Value Sprec Sassoc Tag
9 0 0 "exp"
Non terminals
-------------
Nonterminals
------------
Value Tag
10 $accept