This is the Info file bison.info, produced by Makeinfo version 4.0b
from bison.texinfo.

START-INFO-DIR-ENTRY
* bison: (bison). GNU Project parser generator (yacc replacement).
END-INFO-DIR-ENTRY

This file documents the Bison parser generator.

Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999,
2000, 2001 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided also
that the sections entitled "GNU General Public License" and "Conditions
for Using Bison" are included exactly as in the original, and provided
that the entire resulting derived work is distributed under the terms
of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the sections entitled "GNU General Public
License", "Conditions for Using Bison" and this permission notice may be
included in translations approved by the Free Software Foundation
instead of in the original English.

File: bison.info, Node: Action Features, Prev: Error Reporting, Up: Interface

Special Features for Use in Actions
===================================

Here is a table of Bison constructs, variables and macros that are
useful in actions.

`$$'
     Acts like a variable that contains the semantic value for the
     grouping made by the current rule. *Note Actions::.

`$N'
     Acts like a variable that contains the semantic value for the Nth
     component of the current rule. *Note Actions::.

`$<TYPEALT>$'
     Like `$$' but specifies alternative TYPEALT in the union specified
     by the `%union' declaration. *Note Data Types of Values in
     Actions: Action Types.

`$<TYPEALT>N'
     Like `$N' but specifies alternative TYPEALT in the union specified
     by the `%union' declaration. *Note Data Types of Values in
     Actions: Action Types.

`YYABORT;'
     Return immediately from `yyparse', indicating failure. *Note The
     Parser Function `yyparse': Parser Function.

`YYACCEPT;'
     Return immediately from `yyparse', indicating success. *Note The
     Parser Function `yyparse': Parser Function.

`YYBACKUP (TOKEN, VALUE);'
     Unshift a token. This macro is allowed only for rules that reduce
     a single value, and only when there is no look-ahead token. It
     installs a look-ahead token with token type TOKEN and semantic
     value VALUE; then it discards the value that was going to be
     reduced by this rule.

     If the macro is used when it is not valid, such as when there is a
     look-ahead token already, then it reports a syntax error with a
     message `cannot back up' and performs ordinary error recovery.

     In either case, the rest of the action is not executed.

`YYEMPTY'
     Value stored in `yychar' when there is no look-ahead token.

`YYERROR;'
     Cause an immediate syntax error. This statement initiates error
     recovery just as if the parser itself had detected an error;
     however, it does not call `yyerror', and does not print any
     message. If you want to print an error message, call `yyerror'
     explicitly before the `YYERROR;' statement. *Note Error
     Recovery::.

`YYRECOVERING'
     This macro stands for an expression that has the value 1 when the
     parser is recovering from a syntax error, and 0 the rest of the
     time. *Note Error Recovery::.

`yychar'
     Variable containing the current look-ahead token. (In a pure
     parser, this is actually a local variable within `yyparse'.) When
     there is no look-ahead token, the value `YYEMPTY' is stored in the
     variable. *Note Look-Ahead Tokens: Look-Ahead.

`yyclearin;'
     Discard the current look-ahead token. This is useful primarily in
     error rules. *Note Error Recovery::.

`yyerrok;'
     Resume generating error messages immediately for subsequent syntax
     errors. This is useful primarily in error rules. *Note Error
     Recovery::.

`@$'
     Acts like a structure variable containing information on the
     textual position of the grouping made by the current rule. *Note
     Tracking Locations: Locations.

`@N'
     Acts like a structure variable containing information on the
     textual position of the Nth component of the current rule. *Note
     Tracking Locations: Locations.

File: bison.info, Node: Algorithm, Next: Error Recovery, Prev: Interface, Up: Top

The Bison Parser Algorithm
**************************

As Bison reads tokens, it pushes them onto a stack along with their
semantic values. The stack is called the "parser stack". Pushing a
token is traditionally called "shifting".

For example, suppose the infix calculator has read `1 + 5 *', with a
`3' to come. The stack will have four elements, one for each token
that was shifted.

But the stack does not always have an element for each token read.
When the last N tokens and groupings shifted match the components of a
grammar rule, they can be combined according to that rule. This is
called "reduction". Those tokens and groupings are replaced on the
stack by a single grouping whose symbol is the result (left hand side)
of that rule. Running the rule's action is part of the process of
reduction, because this is what computes the semantic value of the
resulting grouping.

For example, if the infix calculator's parser stack contains this:

     1 + 5 * 3

and the next input token is a newline character, then the last three
elements can be reduced to 15 via the rule:

     expr: expr '*' expr;

Then the stack contains just these three elements:

     1 + 15

At this point, another reduction can be made, resulting in the single
value 16. Then the newline token can be shifted.
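
The two reductions above can be sketched in plain C. This is only an
illustration under stated assumptions, not the code Bison generates:
`struct elem', `reduce_binary' and `demo_reduce' are hypothetical names,
and the stack holds just a number or an operator character per element.

```c
/* One parser-stack element in this sketch: a number or an operator. */
struct elem {
    int is_op;   /* 1 if `value' holds an operator character */
    int value;   /* the number, or the operator such as '+' or '*' */
};

/* Reduce the top three elements "expr OP expr" into a single expr.
   Returns the new stack depth.  Only '+' and '*' are handled. */
int reduce_binary(struct elem *stack, int depth)
{
    int lhs = stack[depth - 3].value;
    int op  = stack[depth - 2].value;
    int rhs = stack[depth - 1].value;

    stack[depth - 3].is_op = 0;
    stack[depth - 3].value = (op == '*') ? lhs * rhs : lhs + rhs;
    return depth - 2;   /* three elements replaced by one */
}

/* Replay the example: the stack holds 1 + 5 * 3. */
int demo_reduce(void)
{
    struct elem stack[8] = { {0, 1}, {1, '+'}, {0, 5}, {1, '*'}, {0, 3} };
    int depth = 5;

    depth = reduce_binary(stack, depth);   /* stack is now 1 + 15 */
    depth = reduce_binary(stack, depth);   /* stack is now 16 */
    return depth == 1 ? stack[0].value : -1;
}
```

Calling `demo_reduce ()' yields 16, matching the walk-through: each
reduction replaces three stack elements with one and computes its value,
which is the job a rule's action performs in a real parser.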

The parser tries, by shifts and reductions, to reduce the entire
input down to a single grouping whose symbol is the grammar's
start-symbol (*note Languages and Context-Free Grammars: Language and
Grammar.).

This kind of parser is known in the literature as a bottom-up parser.

* Menu:

* Look-Ahead::            Parser looks one token ahead when deciding what to do.
* Shift/Reduce::          Conflicts: when either shifting or reduction is valid.
* Precedence::            Operator precedence works by resolving conflicts.
* Contextual Precedence:: When an operator's precedence depends on context.
* Parser States::         The parser is a finite-state-machine with stack.
* Reduce/Reduce::         When two rules are applicable in the same situation.
* Mystery Conflicts::     Reduce/reduce conflicts that look unjustified.
* Stack Overflow::        What happens when stack gets full. How to avoid it.

File: bison.info, Node: Look-Ahead, Next: Shift/Reduce, Up: Algorithm

Look-Ahead Tokens
=================

The Bison parser does _not_ always reduce immediately as soon as the
last N tokens and groupings match a rule. This is because such a
simple strategy is inadequate to handle most languages. Instead, when a
reduction is possible, the parser sometimes "looks ahead" at the next
token in order to decide what to do.

When a token is read, it is not immediately shifted; first it
becomes the "look-ahead token", which is not on the stack. Now the
parser can perform one or more reductions of tokens and groupings on
the stack, while the look-ahead token remains off to the side. When no
more reductions should take place, the look-ahead token is shifted onto
the stack. This does not mean that all possible reductions have been
done; depending on the token type of the look-ahead token, some rules
may choose to delay their application.

Here is a simple case where look-ahead is needed. These three rules
define expressions which contain binary addition operators and postfix
unary factorial operators (`!'), and allow parentheses for grouping.

     expr:     term '+' expr
             | term
             ;

     term:     '(' expr ')'
             | term '!'
             | NUMBER
             ;

Suppose that the tokens `1 + 2' have been read and shifted; what
should be done? If the following token is `)', then the first three
tokens must be reduced to form an `expr'. This is the only valid
course, because shifting the `)' would produce a sequence of symbols
`term ')'', and no rule allows this.

If the following token is `!', then it must be shifted immediately so
that `2 !' can be reduced to make a `term'. If instead the parser were
to reduce before shifting, `1 + 2' would become an `expr'. It would
then be impossible to shift the `!' because doing so would produce on
the stack the sequence of symbols `expr '!''. No rule allows that
sequence.

The current look-ahead token is stored in the variable `yychar'.
*Note Special Features for Use in Actions: Action Features.
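
The choice described above can be written as a small decision function.
This is a hand-derived sketch for the three rules shown, not Bison's
generated tables; the names `after_term', `ACT_SHIFT', `ACT_REDUCE',
`ACT_ERROR', `TOK_END' and `TOK_NUMBER' are assumptions made for the
illustration.

```c
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR };

/* Token codes for this sketch only.  Character tokens stand for
   themselves; TOK_NUMBER is a hypothetical code for NUMBER. */
enum { TOK_END = 0, TOK_NUMBER = 256 };

/* Decide what to do when a `term' is on top of the stack, based on
   the look-ahead token alone. */
enum action after_term(int lookahead)
{
    switch (lookahead) {
    case '!':       /* `term '!'' can still be formed: shift first */
    case '+':       /* `term '+' expr': the '+' has to be shifted */
        return ACT_SHIFT;
    case ')':
    case TOK_END:   /* nothing can extend the term: reduce it to expr */
        return ACT_REDUCE;
    default:        /* '(' or NUMBER cannot follow a term */
        return ACT_ERROR;
    }
}
```

With `1 + 2' on the stack, `after_term ('!')' says to shift and
`after_term (')')' says to reduce, which is the distinction the
look-ahead token exists to make.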

File: bison.info, Node: Shift/Reduce, Next: Precedence, Prev: Look-Ahead, Up: Algorithm

Shift/Reduce Conflicts
======================

Suppose we are parsing a language which has if-then and if-then-else
statements, with a pair of rules like this:

     if_stmt:
               IF expr THEN stmt
             | IF expr THEN stmt ELSE stmt
             ;

Here we assume that `IF', `THEN' and `ELSE' are terminal symbols for
specific keyword tokens.

When the `ELSE' token is read and becomes the look-ahead token, the
contents of the stack (assuming the input is valid) are just right for
reduction by the first rule. But it is also legitimate to shift the
`ELSE', because that would lead to eventual reduction by the second
rule.

This situation, where either a shift or a reduction would be valid,
is called a "shift/reduce conflict". Bison is designed to resolve
these conflicts by choosing to shift, unless otherwise directed by
operator precedence declarations. To see the reason for this, let's
contrast it with the other alternative.

Since the parser prefers to shift the `ELSE', the result is to attach
the else-clause to the innermost if-statement, making these two inputs
equivalent:

     if x then if y then win (); else lose;

     if x then do; if y then win (); else lose; end;

But if the parser chose to reduce when possible rather than shift,
the result would be to attach the else-clause to the outermost
if-statement, making these two inputs equivalent:

     if x then if y then win (); else lose;

     if x then do; if y then win (); end; else lose;

The conflict exists because the grammar as written is ambiguous:
either parsing of the simple nested if-statement is legitimate. The
established convention is that these ambiguities are resolved by
attaching the else-clause to the innermost if-statement; this is what
Bison accomplishes by choosing to shift rather than reduce. (It would
ideally be cleaner to write an unambiguous grammar, but that is very
hard to do in this case.) This particular ambiguity was first
encountered in the specifications of Algol 60 and is called the
"dangling `else'" ambiguity.

To avoid warnings from Bison about predictable, legitimate
shift/reduce conflicts, use the `%expect N' declaration. There will be
no warning as long as the number of shift/reduce conflicts is exactly N.
*Note Suppressing Conflict Warnings: Expect Decl.

The definition of `if_stmt' above is solely to blame for the
conflict, but the conflict does not actually appear without additional
rules. Here is a complete Bison input file that actually manifests the
conflict:

     %token IF THEN ELSE variable
     %%
     stmt:     expr
             | if_stmt
             ;

     if_stmt:
               IF expr THEN stmt
             | IF expr THEN stmt ELSE stmt
             ;

     expr:     variable
             ;

File: bison.info, Node: Precedence, Next: Contextual Precedence, Prev: Shift/Reduce, Up: Algorithm

Operator Precedence
===================

Another situation where shift/reduce conflicts appear is in
arithmetic expressions. Here shifting is not always the preferred
resolution; the Bison declarations for operator precedence allow you to
specify when to shift and when to reduce.

* Menu:

* Why Precedence::        An example showing why precedence is needed.
* Using Precedence::      How to specify precedence in Bison grammars.
* Precedence Examples::   How these features are used in the previous example.
* How Precedence::        How they work.

File: bison.info, Node: Why Precedence, Next: Using Precedence, Up: Precedence

When Precedence is Needed
-------------------------

Consider the following ambiguous grammar fragment (ambiguous because
the input `1 - 2 * 3' can be parsed in two different ways):

     expr:     expr '-' expr
             | expr '*' expr
             | expr '<' expr
             | '(' expr ')'
             ...
             ;

Suppose the parser has seen the tokens `1', `-' and `2'; should it
reduce them via the rule for the subtraction operator? It depends on
the next token. Of course, if the next token is `)', we must reduce;
shifting is invalid because no single rule can reduce the token
sequence `- 2 )' or anything starting with that. But if the next token
is `*' or `<', we have a choice: either shifting or reduction would
allow the parse to complete, but with different results.

To decide which one Bison should do, we must consider the results.
If the next operator token OP is shifted, then it must be reduced first
in order to permit another opportunity to reduce the difference. The
result is (in effect) `1 - (2 OP 3)'. On the other hand, if the
subtraction is reduced before shifting OP, the result is
`(1 - 2) OP 3'. Clearly, then, the choice of shift or reduce should
depend on the relative precedence of the operators `-' and OP: `*'
should be shifted first, but not `<'.

What about input such as `1 - 2 - 5'; should this be `(1 - 2) - 5'
or should it be `1 - (2 - 5)'? For most operators we prefer the
former, which is called "left association". The latter alternative,
"right association", is desirable for assignment operators. The choice
of left or right association is a matter of whether the parser chooses
to shift or reduce when the stack contains `1 - 2' and the look-ahead
token is `-': shifting makes right-associativity.
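
To see that the two groupings of `1 - 2 - 5' really differ, here is a
short C sketch that evaluates a chain of subtractions both ways. The
function names `subtract_left' and `subtract_right' are made up for this
illustration; they model the result of reducing early (left association)
versus shifting first (right association), not the parser itself.

```c
/* Evaluate v[0] - v[1] - ... - v[n-1], grouping from the left,
   as if the parser reduced before each further '-'.  Needs n >= 1. */
int subtract_left(const int *v, int n)
{
    int acc = v[0];
    for (int i = 1; i < n; i++)
        acc = acc - v[i];
    return acc;
}

/* Same operands, grouping from the right, as if the parser shifted
   every '-' and reduced only at the end.  Needs n >= 1. */
int subtract_right(const int *v, int n)
{
    int acc = v[n - 1];
    for (int i = n - 2; i >= 0; i--)
        acc = v[i] - acc;
    return acc;
}
```

For the operands 1, 2 and 5, `subtract_left' gives `(1 - 2) - 5', which
is -6, while `subtract_right' gives `1 - (2 - 5)', which is 4.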

File: bison.info, Node: Using Precedence, Next: Precedence Examples, Prev: Why Precedence, Up: Precedence

Specifying Operator Precedence
------------------------------

Bison allows you to specify these choices with the operator
precedence declarations `%left' and `%right'. Each such declaration
contains a list of tokens, which are operators whose precedence and
associativity are being declared. The `%left' declaration makes all
those operators left-associative and the `%right' declaration makes
them right-associative. A third alternative is `%nonassoc', which
declares that it is a syntax error to find the same operator twice "in a
row".

The relative precedence of different operators is controlled by the
order in which they are declared. The first `%left' or `%right'
declaration in the file declares the operators whose precedence is
lowest, the next such declaration declares the operators whose
precedence is a little higher, and so on.

File: bison.info, Node: Precedence Examples, Next: How Precedence, Prev: Using Precedence, Up: Precedence

Precedence Examples
-------------------

In our example, we would want the following declarations:

     %left '<'
     %left '-'
     %left '*'

In a more complete example, which supports other operators as well,
we would declare them in groups of equal precedence. For example,
`'+'' is declared with `'-'':

     %left '<' '>' '=' NE LE GE
     %left '+' '-'
     %left '*' '/'

(Here `NE' and so on stand for the operators for "not equal" and so on.
We assume that these tokens are more than one character long and
therefore are represented by names, not character literals.)

File: bison.info, Node: How Precedence, Prev: Precedence Examples, Up: Precedence

How Precedence Works
--------------------

The first effect of the precedence declarations is to assign
precedence levels to the terminal symbols declared. The second effect
is to assign precedence levels to certain rules: each rule gets its
precedence from the last terminal symbol mentioned in the components.
(You can also specify explicitly the precedence of a rule. *Note
Context-Dependent Precedence: Contextual Precedence.)

Finally, the resolution of conflicts works by comparing the
precedence of the rule being considered with that of the look-ahead
token. If the token's precedence is higher, the choice is to shift.
If the rule's precedence is higher, the choice is to reduce. If they
have equal precedence, the choice is made based on the associativity of
that precedence level. The verbose output file made by `-v' (*note
Invoking Bison: Invocation.) says how each conflict was resolved.

Not all rules and not all tokens have precedence. If either the
rule or the look-ahead token has no precedence, then the default is to
shift.
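
The resolution procedure just described fits in one small C function.
This is a sketch of the rules as stated in this section, not an excerpt
from Bison's sources. The names `resolve', `NO_PREC' and the two
enumerations are assumptions, as is the convention that precedence levels
are positive integers with larger meaning higher.

```c
enum assoc  { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC };
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR };

/* Hypothetical marker: the rule or token has no declared precedence. */
#define NO_PREC 0

/* Resolve a shift/reduce conflict between a rule and the look-ahead
   token.  `level_assoc' is consulted only when both levels are equal. */
enum action resolve(int rule_prec, int token_prec, enum assoc level_assoc)
{
    if (rule_prec == NO_PREC || token_prec == NO_PREC)
        return ACT_SHIFT;               /* default when precedence is missing */
    if (token_prec > rule_prec)
        return ACT_SHIFT;               /* token binds tighter */
    if (rule_prec > token_prec)
        return ACT_REDUCE;              /* rule binds tighter */

    switch (level_assoc) {              /* equal precedence */
    case ASSOC_LEFT:  return ACT_REDUCE;
    case ASSOC_RIGHT: return ACT_SHIFT;
    default:          return ACT_ERROR; /* %nonassoc: syntax error */
    }
}
```

Taking the levels 1 for `<', 2 for `-' and 3 for `*' from the earlier
declarations, a stack holding `1 - 2' with look-ahead `*' gives
`resolve (2, 3, ASSOC_LEFT)', which shifts; with look-ahead `<' it
reduces, and with another `-' the left associativity makes it reduce.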

File: bison.info, Node: Contextual Precedence, Next: Parser States, Prev: Precedence, Up: Algorithm

Context-Dependent Precedence
============================

Often the precedence of an operator depends on the context. This
sounds outlandish at first, but it is really very common. For example,
a minus sign typically has a very high precedence as a unary operator,
and a somewhat lower precedence (lower than multiplication) as a binary
operator.

The Bison precedence declarations, `%left', `%right' and
`%nonassoc', can only be used once for a given token; so a token has
only one precedence declared in this way. For context-dependent
precedence, you need to use an additional mechanism: the `%prec'
modifier for rules.

The `%prec' modifier declares the precedence of a particular rule by
specifying a terminal symbol whose precedence should be used for that
rule. It's not necessary for that symbol to appear otherwise in the
rule. The modifier's syntax is:

     %prec TERMINAL-SYMBOL

and it is written after the components of the rule. Its effect is to
assign the rule the precedence of TERMINAL-SYMBOL, overriding the
precedence that would be deduced for it in the ordinary way. The
altered rule precedence then affects how conflicts involving that rule
are resolved (*note Operator Precedence: Precedence.).

Here is how `%prec' solves the problem of unary minus. First,
declare a precedence for a fictitious terminal symbol named `UMINUS'.
There are no tokens of this type, but the symbol serves to stand for its
precedence:

     ...
     %left '+' '-'
     %left '*'
     %left UMINUS

Now the precedence of `UMINUS' can be used in specific rules:

     exp:    ...
             | exp '-' exp
             ...
             | '-' exp %prec UMINUS

File: bison.info, Node: Parser States, Next: Reduce/Reduce, Prev: Contextual Precedence, Up: Algorithm

Parser States
=============

The function `yyparse' is implemented using a finite-state machine.
The values pushed on the parser stack are not simply token type codes;
they represent the entire sequence of terminal and nonterminal symbols
at or near the top of the stack. The current state collects all the
information about previous input which is relevant to deciding what to
do next.

Each time a look-ahead token is read, the current parser state
together with the type of look-ahead token are looked up in a table.
This table entry can say, "Shift the look-ahead token." In this case,
it also specifies the new parser state, which is pushed onto the top of
the parser stack. Or it can say, "Reduce using rule number N." This
means that a certain number of tokens or groupings are taken off the
top of the stack, and replaced by one grouping. In other words, that
number of states are popped from the stack, and one new state is pushed.

There is one other alternative: the table can say that the
look-ahead token is erroneous in the current state. This causes error
processing to begin (*note Error Recovery::).
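
As a concrete picture of states being pushed and popped, here is a
hand-built state machine in C for a toy grammar that is not taken from
this manual: `S: '(' S ')' | 'x' ;'. It is a sketch under stated
assumptions, not Bison output. The state numbers, the `parse' function
and `MAX_DEPTH' are all made up, the "table" is written out as `switch'
and `if' statements for brevity, and there are no semantic values or
error recovery.

```c
#include <stddef.h>

enum { MAX_DEPTH = 64 };   /* fixed stack size, for this sketch only */

/* States: 0 start, 1 after S (accept at end of input), 2 after '(',
   3 after 'x', 4 after '(' S, 5 after '(' S ')'.
   Returns 1 if the input is accepted, 0 on a syntax error. */
int parse(const char *input)
{
    int stack[MAX_DEPTH];
    int top = 0;
    size_t pos = 0;

    stack[0] = 0;
    for (;;) {
        int state = stack[top];
        char tok = input[pos];      /* look-ahead; '\0' is end of input */
        int rhs_len = 0;
        int next;

        if (state == 1)
            return tok == '\0';     /* accept only at end of input */
        if (state == 3)
            rhs_len = 1;            /* reduce by  S: 'x'          */
        if (state == 5)
            rhs_len = 3;            /* reduce by  S: '(' S ')'    */

        if (rhs_len > 0) {
            top -= rhs_len;         /* pop one state per component */
            next = (stack[top] == 0) ? 1 : 4;   /* "goto" on S */
            stack[++top] = next;    /* push the state for the grouping */
            continue;
        }

        /* Otherwise try to shift the look-ahead token. */
        if ((state == 0 || state == 2) && tok == '(')
            next = 2;
        else if ((state == 0 || state == 2) && tok == 'x')
            next = 3;
        else if (state == 4 && tok == ')')
            next = 5;
        else
            return 0;               /* token is erroneous in this state */

        if (top + 1 >= MAX_DEPTH)
            return 0;               /* stack overflow */
        stack[++top] = next;
        pos++;
    }
}
```

For example, `parse ("((x))")' accepts, while `parse ("(x")' fails in
state 4 because end of input is not a valid look-ahead there. Note that
the stack holds only state numbers: each reduction pops as many states
as the rule has components and pushes one new state.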

File: bison.info, Node: Reduce/Reduce, Next: Mystery Conflicts, Prev: Parser States, Up: Algorithm

Reduce/Reduce Conflicts
=======================

A reduce/reduce conflict occurs if there are two or more rules that
apply to the same sequence of input. This usually indicates a serious
error in the grammar.

For example, here is an erroneous attempt to define a sequence of
zero or more `word' groupings.

     sequence: /* empty */
                     { printf ("empty sequence\n"); }
             | maybeword
             | sequence word
                     { printf ("added word %s\n", $2); }
             ;

     maybeword: /* empty */
                     { printf ("empty maybeword\n"); }
             | word
                     { printf ("single word %s\n", $1); }
             ;

The error is an ambiguity: there is more than one way to parse a single
`word' into a `sequence'. It could be reduced to a `maybeword' and
then into a `sequence' via the second rule. Alternatively,
nothing-at-all could be reduced into a `sequence' via the first rule,
and this could be combined with the `word' using the third rule for
`sequence'.

There is also more than one way to reduce nothing-at-all into a
`sequence'. This can be done directly via the first rule, or
indirectly via `maybeword' and then the second rule.

You might think that this is a distinction without a difference,
because it does not change whether any particular input is valid or
not. But it does affect which actions are run. One parsing order runs
the second rule's action; the other runs the first rule's action and
the third rule's action. In this example, the output of the program
changes.

Bison resolves a reduce/reduce conflict by choosing to use the rule
that appears first in the grammar, but it is very risky to rely on
this. Every reduce/reduce conflict must be studied and usually
eliminated. Here is the proper way to define `sequence':

     sequence: /* empty */
                     { printf ("empty sequence\n"); }
             | sequence word
                     { printf ("added word %s\n", $2); }
             ;

Here is another common error that yields a reduce/reduce conflict:

     sequence: /* empty */
             | sequence words
             | sequence redirects
             ;

     words:    /* empty */
             | words word
             ;

     redirects:/* empty */
             | redirects redirect
             ;

The intention here is to define a sequence which can contain either
`word' or `redirect' groupings. The individual definitions of
`sequence', `words' and `redirects' are error-free, but the three
together make a subtle ambiguity: even an empty input can be parsed in
infinitely many ways!

Consider: nothing-at-all could be a `words'. Or it could be two
`words' in a row, or three, or any number. It could equally well be a
`redirects', or two, or any number. Or it could be a `words' followed
by three `redirects' and another `words'. And so on.

Here are two ways to correct these rules. First, to make it a
single level of sequence:

     sequence: /* empty */
             | sequence word
             | sequence redirect
             ;

Second, to prevent either a `words' or a `redirects' from being
empty:

     sequence: /* empty */
             | sequence words
             | sequence redirects
             ;

     words:    word
             | words word
             ;

     redirects:redirect
             | redirects redirect
             ;

File: bison.info, Node: Mystery Conflicts, Next: Stack Overflow, Prev: Reduce/Reduce, Up: Algorithm

Mysterious Reduce/Reduce Conflicts
==================================

Sometimes reduce/reduce conflicts can occur that don't look
warranted. Here is an example:

     %token ID

     %%
     def:    param_spec return_spec ','
             ;
     param_spec:
                  type
             |    name_list ':' type
             ;
     return_spec:
                  type
             |    name ':' type
             ;
     type:        ID
             ;
     name:        ID
             ;
     name_list:
                  name
             |    name ',' name_list
             ;

It would seem that this grammar can be parsed with only a single
token of look-ahead: when a `param_spec' is being read, an `ID' is a
`name' if a comma or colon follows, or a `type' if another `ID'
follows. In other words, this grammar is LR(1).

However, Bison, like most parser generators, cannot actually handle
all LR(1) grammars. In this grammar, two contexts, that after an `ID'
at the beginning of a `param_spec' and likewise at the beginning of a
`return_spec', are similar enough that Bison assumes they are the same.
They appear similar because the same set of rules would be active--the
rule for reducing to a `name' and that for reducing to a `type'. Bison
is unable to determine at that stage of processing that the rules would
require different look-ahead tokens in the two contexts, so it makes a
single parser state for them both. Combining the two contexts causes a
conflict later. In parser terminology, this occurrence means that the
grammar is not LALR(1).

In general, it is better to fix deficiencies than to document them.
But this particular deficiency is intrinsically hard to fix; parser
generators that can handle LR(1) grammars are hard to write and tend to
produce parsers that are very large. In practice, Bison is more useful
as it is now.

When the problem arises, you can often fix it by identifying the two
parser states that are being confused, and adding something to make them
look distinct. In the above example, adding one rule to `return_spec'
as follows makes the problem go away:

     %token BOGUS
     ...
     %%
     ...
     return_spec:
                  type
             |    name ':' type
             /* This rule is never used. */
             |    ID BOGUS
             ;

This corrects the problem because it introduces the possibility of an
additional active rule in the context after the `ID' at the beginning of
`return_spec'. This rule is not active in the corresponding context in
a `param_spec', so the two contexts receive distinct parser states. As
long as the token `BOGUS' is never generated by `yylex', the added rule
cannot alter the way actual input is parsed.

In this particular example, there is another way to solve the
problem: rewrite the rule for `return_spec' to use `ID' directly
instead of via `name'. This also causes the two confusing contexts to
have different sets of active rules, because the one for `return_spec'
activates the altered rule for `return_spec' rather than the one for
`name'.

     param_spec:
                  type
             |    name_list ':' type
             ;
     return_spec:
                  type
             |    ID ':' type
             ;

File: bison.info, Node: Stack Overflow, Prev: Mystery Conflicts, Up: Algorithm

Stack Overflow, and How to Avoid It
===================================

The Bison parser stack can overflow if too many tokens are shifted
and not reduced. When this happens, the parser function `yyparse'
returns a nonzero value, pausing only to call `yyerror' to report the
overflow.

By defining the macro `YYMAXDEPTH', you can control how deep the
parser stack can become before a stack overflow occurs. Define the
macro with a value that is an integer. This value is the maximum number
of tokens that can be shifted (and not reduced) before overflow. It
must be a constant expression whose value is known at compile time.

The stack space allowed is not necessarily allocated. If you
specify a large value for `YYMAXDEPTH', the parser actually allocates a
small stack at first, and then makes it bigger by stages as needed.
This increasing allocation happens automatically and silently.
Therefore, you do not need to make `YYMAXDEPTH' painfully small merely
to save space for ordinary inputs that do not need much stack.

The default value of `YYMAXDEPTH', if you do not define it, is 10000.

You can control how much stack is allocated initially by defining the
macro `YYINITDEPTH'. This value too must be a compile-time constant
integer. The default is 200.
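
The "small at first, bigger by stages" behaviour can be pictured with a
growable array in C. This is a sketch only: `struct pstack',
`pstack_push', `SKETCH_INITDEPTH' and `SKETCH_MAXDEPTH' are hypothetical
names that mirror the documented defaults, and doubling the capacity at
each stage is an assumption made here, not a statement about how the
generated parser grows its stack.

```c
#include <stdlib.h>

#define SKETCH_INITDEPTH 200     /* mirrors the YYINITDEPTH default */
#define SKETCH_MAXDEPTH  10000   /* mirrors the YYMAXDEPTH default  */

struct pstack {
    int   *items;      /* NULL until the first push */
    size_t size;       /* elements in use */
    size_t capacity;   /* elements allocated */
};

/* Push one state.  Returns 0 on success, -1 on stack overflow
   (the maximum depth was reached) or when memory runs out. */
int pstack_push(struct pstack *s, int state)
{
    if (s->size == s->capacity) {
        size_t grown;
        int *p;

        if (s->capacity >= SKETCH_MAXDEPTH)
            return -1;                       /* overflow */
        grown = s->capacity ? s->capacity * 2 : SKETCH_INITDEPTH;
        if (grown > SKETCH_MAXDEPTH)
            grown = SKETCH_MAXDEPTH;         /* never exceed the limit */
        p = realloc(s->items, grown * sizeof *p);
        if (p == NULL)
            return -1;
        s->items = p;
        s->capacity = grown;
    }
    s->items[s->size++] = state;
    return 0;
}
```

Starting from `struct pstack s = { NULL, 0, 0 };', the first push
allocates room for 200 elements. Pushes keep succeeding, with the array
growing silently, until 10000 elements are held; the next push reports
overflow. Small inputs therefore never pay for the full limit.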
|
||
|
||
|
||
File: bison.info, Node: Error Recovery, Next: Context Dependency, Prev: Algorithm, Up: Top
|
||
|
||
Error Recovery
|
||
**************
|
||
|
||
It is not usually acceptable to have a program terminate on a parse
error. For example, a compiler should recover sufficiently to parse the
rest of the input file and check it for errors; a calculator should
accept another expression.

In a simple interactive command parser where each input is one line,
it may be sufficient to allow `yyparse' to return 1 on error and have
the caller ignore the rest of the input line when that happens (and
then call `yyparse' again). But this is inadequate for a compiler,
because it forgets all the syntactic context leading up to the error.
A syntax error deep within a function in the compiler input should not
cause the compiler to treat the following line like the beginning of a
source file.

You can define how to recover from a syntax error by writing rules to
recognize the special token `error'. This is a terminal symbol that is
always defined (you need not declare it) and reserved for error
handling. The Bison parser generates an `error' token whenever a
syntax error happens; if you have provided a rule to recognize this
token in the current context, the parse can continue.

For example:

     stmnts:  /* empty string */
             | stmnts '\n'
             | stmnts exp '\n'
             | stmnts error '\n'

The fourth rule in this example says that an error followed by a
newline makes a valid addition to any `stmnts'.

What happens if a syntax error occurs in the middle of an `exp'? The
error recovery rule, interpreted strictly, applies to the precise
sequence of a `stmnts', an `error' and a newline. If an error occurs in
the middle of an `exp', there will probably be some additional tokens
and subexpressions on the stack after the last `stmnts', and there will
be tokens to read before the next newline. So the rule is not
applicable in the ordinary way.

But Bison can force the situation to fit the rule, by discarding
part of the semantic context and part of the input. First it discards
states and objects from the stack until it gets back to a state in
which the `error' token is acceptable. (This means that the
subexpressions already parsed are discarded, back to the last complete
`stmnts'.) At this point the `error' token can be shifted. Then, if
the old look-ahead token is not acceptable to be shifted next, the
parser reads tokens and discards them until it finds a token which is
acceptable. In this example, Bison reads and discards input until the
next newline so that the fourth rule can apply.

The choice of error rules in the grammar is a choice of strategies
for error recovery. A simple and useful strategy is simply to skip the
rest of the current input line or current statement if an error is
detected:

     stmnt: error ';'  /* on error, skip until ';' is read */

It is also useful to recover to the matching close-delimiter of an
opening-delimiter that has already been parsed. Otherwise the
close-delimiter will probably appear to be unmatched, and generate
another, spurious error message:

     primary:  '(' expr ')'
             | '(' error ')'
             ...
             ;

Error recovery strategies are necessarily guesses. When they guess
wrong, one syntax error often leads to another. In the above example,
the error recovery rule guesses that an error is due to bad input
within one `stmnt'. Suppose that instead a spurious semicolon is
inserted in the middle of a valid `stmnt'. After the error recovery
rule recovers from the first error, another syntax error will be found
straightaway, since the text following the spurious semicolon is also
an invalid `stmnt'.

To prevent an outpouring of error messages, the parser will output
no error message for another syntax error that happens shortly after
the first; only after three consecutive input tokens have been
successfully shifted will error messages resume.

Note that rules which accept the `error' token may have actions, just
as any other rules can.

You can make error messages resume immediately by using the macro
`yyerrok' in an action. If you do this in the error rule's action, no
error messages will be suppressed. This macro requires no arguments;
`yyerrok;' is a valid C statement.

The previous look-ahead token is reanalyzed immediately after an
error. If this is unacceptable, then the macro `yyclearin' may be used
to clear this token. Write the statement `yyclearin;' in the error
rule's action.

For example, suppose that on a parse error, an error handling
routine is called that advances the input stream to some point where
parsing should once again commence. The next symbol returned by the
lexical scanner is probably correct. The previous look-ahead token
ought to be discarded with `yyclearin;'.

The macro `YYRECOVERING' stands for an expression that has the value
1 when the parser is recovering from a syntax error, and 0 the rest of
the time. A value of 1 indicates that error messages are currently
suppressed for new syntax errors.

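The suppression of follow-on error messages can be modeled with a
single counter. The sketch below is an illustration of the behavior
described in this chapter, not an excerpt from the generated parser:
`struct recovery' and the functions `on_syntax_error', `on_shift',
`errok' and `recovering' are hypothetical names, and the counter
`reported' merely stands in for calls to `yyerror'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recovery state; the names are illustrative only.  */
struct recovery
{
  int errstatus;   /* input tokens still to shift before messages resume */
  int reported;    /* number of error messages actually issued */
};

/* A syntax error was detected.  Report it only when no earlier error
   is still being recovered from, then (re)start the count of three.  */
static void
on_syntax_error (struct recovery *r)
{
  assert (r != NULL);
  if (r->errstatus == 0)
    r->reported++;          /* a real parser would call yyerror here */
  r->errstatus = 3;
}

/* An input token was shifted successfully.  */
static void
on_shift (struct recovery *r)
{
  if (r->errstatus > 0)
    r->errstatus--;
}

/* In the role of `yyerrok': resume error messages immediately.  */
static void
errok (struct recovery *r)
{
  r->errstatus = 0;
}

/* In the role of `YYRECOVERING': 1 while messages are suppressed.  */
static int
recovering (const struct recovery *r)
{
  return r->errstatus != 0;
}
```

With this model, a second error arriving after only two shifted tokens
is silent, while an error arriving after three shifted tokens, or after
a call to `errok', is reported again.
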
File: bison.info, Node: Context Dependency, Next: Debugging, Prev: Error Recovery, Up: Top

Handling Context Dependencies
*****************************

The Bison paradigm is to parse tokens first, then group them into
larger syntactic units. In many languages, the meaning of a token is
affected by its context. Although this violates the Bison paradigm,
certain techniques (known as "kludges") may enable you to write Bison
parsers for such languages.

* Menu:

* Semantic Tokens::   Token parsing can depend on the semantic context.
* Lexical Tie-ins::   Token parsing can depend on the syntactic context.
* Tie-in Recovery::   Lexical tie-ins have implications for how
                        error recovery rules must be written.

(Actually, "kludge" means any technique that gets its job done but is
neither clean nor robust.)

File: bison.info, Node: Semantic Tokens, Next: Lexical Tie-ins, Up: Context Dependency

Semantic Info in Token Types
============================

The C language has a context dependency: the way an identifier is
used depends on what its current meaning is. For example, consider
this:

     foo (x);

This looks like a function call statement, but if `foo' is a typedef
name, then this is actually a declaration of `x'. How can a Bison
parser for C decide how to parse this input?

The method used in GNU C is to have two different token types,
`IDENTIFIER' and `TYPENAME'. When `yylex' finds an identifier, it
looks up the current declaration of the identifier in order to decide
which token type to return: `TYPENAME' if the identifier is declared as
a typedef, `IDENTIFIER' otherwise.

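The lookup that `yylex' performs can be sketched as follows. This is a
toy, not the GNU C lexer: the table `typedef_names', the functions
`declare_typedef' and `classify_identifier', and the numeric token
codes are all assumptions made for the illustration. A real compiler
would consult its scoped symbol table instead of a flat array.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes; a real parser takes them from Bison.  */
enum token_type { IDENTIFIER = 258, TYPENAME = 259 };

#define MAX_TYPEDEFS 32

/* A flat list of the names currently declared as typedefs.  */
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count;

/* Record NAME as a typedef name (called when a typedef is parsed).  */
static void
declare_typedef (const char *name)
{
  assert (typedef_count < MAX_TYPEDEFS);
  typedef_names[typedef_count++] = name;
}

/* Decide which token type the lexer should return for NAME.  */
static enum token_type
classify_identifier (const char *name)
{
  int i;

  for (i = 0; i < typedef_count; i++)
    if (strcmp (typedef_names[i], name) == 0)
      return TYPENAME;
  return IDENTIFIER;
}
```

The same spelling therefore yields a different token type before and
after the typedef has been seen, which is all the grammar needs in
order to tell a declaration from a function call.
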
The grammar rules can then express the context dependency by the
choice of token type to recognize. `IDENTIFIER' is accepted as an
expression, but `TYPENAME' is not. `TYPENAME' can start a declaration,
but `IDENTIFIER' cannot. In contexts where the meaning of the
identifier is _not_ significant, such as in declarations that can
shadow a typedef name, either `TYPENAME' or `IDENTIFIER' is
accepted--there is one rule for each of the two token types.

This technique is simple to use if the decision of which kinds of
identifiers to allow is made at a place close to where the identifier is
parsed. But in C this is not always so: C allows a declaration to
redeclare a typedef name provided an explicit type has been specified
earlier:

     typedef int foo, bar, lose;
     static foo (bar);        /* redeclare `bar' as static variable */
     static int foo (lose);   /* redeclare `foo' as function */

Unfortunately, the name being declared is separated from the
declaration construct itself by a complicated syntactic structure--the
"declarator".

As a result, part of the Bison parser for C needs to be duplicated,
with all the nonterminal names changed: once for parsing a declaration
in which a typedef name can be redefined, and once for parsing a
declaration in which that can't be done. Here is a part of the
duplication, with actions omitted for brevity:

     initdcl:
               declarator maybeasm '='
               init
             | declarator maybeasm
             ;

     notype_initdcl:
               notype_declarator maybeasm '='
               init
             | notype_declarator maybeasm
             ;

Here `initdcl' can redeclare a typedef name, but `notype_initdcl'
cannot. The distinction between `declarator' and `notype_declarator'
is the same sort of thing.

There is some similarity between this technique and a lexical tie-in
(described next), in that information which alters the lexical analysis
is changed during parsing by other parts of the program. The
difference is that here the information is global, and is used for
other purposes in the program. A true lexical tie-in has a
special-purpose flag controlled by the syntactic context.

File: bison.info, Node: Lexical Tie-ins, Next: Tie-in Recovery, Prev: Semantic Tokens, Up: Context Dependency

Lexical Tie-ins
===============

One way to handle context-dependency is the "lexical tie-in": a flag
which is set by Bison actions, whose purpose is to alter the way tokens
are parsed.

For example, suppose we have a language vaguely like C, but with a
special construct `hex (HEX-EXPR)'. After the keyword `hex' comes an
expression in parentheses in which all integers are hexadecimal. In
particular, the token `a1b' must be treated as an integer rather than
as an identifier if it appears in that context. Here is how you can do
it:

     %{
     int hexflag;
     %}
     %%
     ...
     expr:   IDENTIFIER
             | constant
             | HEX '('
                     { hexflag = 1; }
               expr ')'
                     { hexflag = 0;
                        $$ = $4; }
             | expr '+' expr
                     { $$ = make_sum ($1, $3); }
             ...
             ;

     constant:
               INTEGER
             | STRING
             ;

Here we assume that `yylex' looks at the value of `hexflag'; when it is
nonzero, all integers are parsed in hexadecimal, and tokens starting
with letters are parsed as integers if possible.

The declaration of `hexflag' shown in the C declarations section of
the parser file is needed to make it accessible to the actions (*note
The C Declarations Section: C Declarations.). You must also write the
code in `yylex' to obey the flag.

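One way the lexer might obey the flag is sketched below. It is a
minimal illustration under several assumptions: the function
`classify_word' is hypothetical, it receives a word that has already
been split off from the input, the token codes are invented, and
anything that is not a well-formed integer is simply reported as
`IDENTIFIER'.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token codes; a real parser takes them from Bison.  */
enum token_type { IDENTIFIER = 258, INTEGER = 259 };

static int hexflag;   /* set and cleared by the grammar actions */

/* Classify one already-delimited WORD.  For an integer, store its
   value in *VALUE and return INTEGER; otherwise return IDENTIFIER.  */
static enum token_type
classify_word (const char *word, long *value)
{
  char *end;

  assert (word != NULL && value != NULL);
  if (hexflag)
    {
      /* Inside `hex (...)': the whole word must be hexadecimal.  */
      *value = strtol (word, &end, 16);
      if (end != word && *end == '\0')
        return INTEGER;
    }
  else if (isdigit ((unsigned char) word[0]))
    {
      /* Elsewhere: only words starting with a digit are integers.  */
      *value = strtol (word, &end, 10);
      if (end != word && *end == '\0')
        return INTEGER;
    }
  return IDENTIFIER;
}
```

With `hexflag' clear, `a1b' comes back as an identifier; with it set,
the same word comes back as the integer 0xa1b, while a word such as
`foo' is still an identifier because it is not entirely hexadecimal.
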
File: bison.info, Node: Tie-in Recovery, Prev: Lexical Tie-ins, Up: Context Dependency

Lexical Tie-ins and Error Recovery
==================================

Lexical tie-ins make strict demands on any error recovery rules you
have. *Note Error Recovery::.

The reason for this is that the purpose of an error recovery rule is
to abort the parsing of one construct and resume in some larger
construct. For example, in C-like languages, a typical error recovery
rule is to skip tokens until the next semicolon, and then start a new
statement, like this:

     stmt:   expr ';'
             | IF '(' expr ')' stmt { ... }
             ...
             | error ';'
                     { hexflag = 0; }
             ;

If there is a syntax error in the middle of a `hex (EXPR)'
construct, this error rule will apply, and then the action for the
completed `hex (EXPR)' will never run. So `hexflag' would remain set
for the entire rest of the input, or until the next `hex' keyword,
causing identifiers to be misinterpreted as integers.

To avoid this problem the error recovery rule itself clears
`hexflag'.

There may also be an error recovery rule that works within
expressions. For example, there could be a rule which applies within
parentheses and skips to the close-parenthesis:

     expr:   ...
             | '(' expr ')'
                     { $$ = $2; }
             | '(' error ')'
             ...

If this rule acts within the `hex' construct, it is not going to
abort that construct (since it applies to an inner level of parentheses
within the construct). Therefore, it should not clear the flag: the
rest of the `hex' construct should be parsed with the flag still in
effect.

What if there is an error recovery rule which might abort out of the
`hex' construct or might not, depending on circumstances? There is no
way you can write the action to determine whether a `hex' construct is
being aborted or not. So if you are using a lexical tie-in, you had
better make sure your error recovery rules are not of this kind. Each
rule must be such that you can be sure that it always will, or always
won't, have to clear the flag.

File: bison.info, Node: Debugging, Next: Invocation, Prev: Context Dependency, Up: Top

Debugging Your Parser
*********************

If a Bison grammar compiles properly but doesn't do what you want
when it runs, the `yydebug' parser-trace feature can help you figure
out why.

To enable compilation of trace facilities, you must define the macro
`YYDEBUG' when you compile the parser. You could use `-DYYDEBUG=1' as
a compiler option or you could put `#define YYDEBUG 1' in the C
declarations section of the grammar file (*note The C Declarations
Section: C Declarations.). Alternatively, use the `-t' option when you
run Bison (*note Invoking Bison: Invocation.). We always define
`YYDEBUG' so that debugging is always possible.

The trace facility uses `stderr', so you must add
`#include <stdio.h>' to the C declarations section unless it is already
there.

Once you have compiled the program with trace facilities, the way to
request a trace is to store a nonzero value in the variable `yydebug'.
You can do this by making the C code do it (in `main', perhaps), or you
can alter the value with a C debugger.

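The two switches work together: the macro decides whether the tracing
code is compiled at all, and the variable decides whether it runs. The
sketch below shows that pattern in isolation. It does not reproduce
the generated parser: `MY_DEBUG', `my_debug', `trace' and `trace_count'
are hypothetical stand-ins for `YYDEBUG', `yydebug' and the parser's
internal trace code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define MY_DEBUG 1        /* compile-time switch, in the role of YYDEBUG */

static int my_debug;      /* run-time switch, in the role of yydebug */
static int trace_count;   /* number of trace lines written so far */

/* Write one trace line on stderr, but only if tracing was compiled in
   and has been requested at run time.  */
static void
trace (const char *format, ...)
{
  assert (format != NULL);
#if MY_DEBUG
  if (my_debug)
    {
      va_list args;

      va_start (args, format);
      vfprintf (stderr, format, args);
      va_end (args);
      fputc ('\n', stderr);
      trace_count++;
    }
#endif
}
```

A program built this way stays silent until something, whether `main'
or a debugger, stores a nonzero value in the run-time flag.
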
Each step taken by the parser when `yydebug' is nonzero produces a
line or two of trace information, written on `stderr'. The trace
messages tell you these things:

   * Each time the parser calls `yylex', what kind of token was read.

   * Each time a token is shifted, the depth and complete contents of
     the state stack (*note Parser States::).

   * Each time a rule is reduced, which rule it is, and the complete
     contents of the state stack afterward.

To make sense of this information, it helps to refer to the listing
file produced by the Bison `-v' option (*note Invoking Bison:
Invocation.). This file shows the meaning of each state in terms of
positions in various rules, and also what each state will do with each
possible input token. As you read the successive trace messages, you
can see that the parser is functioning according to its specification
in the listing file. Eventually you will arrive at the place where
something undesirable happens, and you will see which parts of the
grammar are to blame.

The parser file is a C program and you can use C debuggers on it,
but it's not easy to interpret what it is doing. The parser function
is a finite-state machine interpreter, and aside from the actions it
executes the same code over and over. Only the values of variables
show where in the grammar it is working.

The debugging information normally gives the token type of each token
read, but not its semantic value. You can optionally define a macro
named `YYPRINT' to provide a way to print the value. If you define
`YYPRINT', it should take three arguments. The parser will pass a
standard I/O stream, the numeric code for the token type, and the token
value (from `yylval').

Here is an example of `YYPRINT' suitable for the multi-function
calculator (*note Declarations for `mfcalc': Mfcalc Decl.):

     #define YYPRINT(file, type, value)   yyprint (file, type, value)

     /* Print the semantic value of a token in the parser trace.  */
     static void
     yyprint (FILE *file, int type, YYSTYPE value)
     {
       if (type == VAR)
         fprintf (file, " %s", value.tptr->name);   /* symbol name */
       else if (type == NUM)
         fprintf (file, " %g", value.val);          /* `val' is a double */
     }

File: bison.info, Node: Invocation, Next: Table of Symbols, Prev: Debugging, Up: Top

Invoking Bison
**************

The usual way to invoke Bison is as follows:

     bison INFILE

Here INFILE is the grammar file name, which usually ends in `.y'.
The parser file's name is made by replacing the `.y' with `.tab.c'.
Thus, `bison foo.y' yields `foo.tab.c', and `bison hack/foo.y' yields
`hack/foo.tab.c'. If you are writing C++ code instead of C in your
grammar file, you can also name it `foo.ypp' or `foo.y++'. The output
files then take an extension that mirrors the one given as input
(respectively `foo.tab.cpp' and `foo.tab.c++'). This feature takes
effect with all options that manipulate file names, such as `-o' or
`-d'.

For example:

     bison -d INFILE.YXX

will produce `infile.tab.cxx' and `infile.tab.hxx', and

     bison -d INFILE.Y -o OUTPUT.C++

will produce `output.c++' and `output.h++'.

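The naming rule for the parser file can be written down as a small
function. This is a guess at the rule, inferred only from the examples
above, and not Bison's own implementation; the function
`parser_file_name' is hypothetical, and it covers only the default name
(no `-o', `-b' or `-y').

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the default parser file name for GRAMMAR into OUT, which has
   room for OUT_SIZE bytes.  Return 0 on success, 1 if GRAMMAR has no
   `.y...' extension or the result does not fit.  */
static int
parser_file_name (const char *grammar, char *out, size_t out_size)
{
  const char *dot = strrchr (grammar, '.');
  const char *slash = strrchr (grammar, '/');
  int n;

  assert (out != NULL && out_size > 0);
  /* The last dot must belong to the file name, not to a directory,
     and the extension must start with `y'.  */
  if (dot == NULL || (slash != NULL && dot < slash) || dot[1] != 'y')
    return 1;
  /* Keep everything before the dot; turn `.yREST' into `.tab.cREST'.  */
  n = snprintf (out, out_size, "%.*s.tab.c%s",
                (int) (dot - grammar), grammar, dot + 2);
  return (n < 0 || (size_t) n >= out_size) ? 1 : 0;
}
```

Under this rule `foo.y' maps to `foo.tab.c', `hack/foo.y' to
`hack/foo.tab.c', and `foo.ypp' to `foo.tab.cpp'.
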
* Menu:

* Bison Options::         All the options described in detail,
                            in alphabetical order by short options.
* Environment Variables:: Variables which affect Bison execution.
* Option Cross Key::      Alphabetical list of long options.
* VMS Invocation::        Bison command syntax on VMS.

File: bison.info, Node: Bison Options, Next: Environment Variables, Up: Invocation

Bison Options
=============

Bison supports both traditional single-letter options and mnemonic
long option names. Long option names are indicated with `--' instead of
`-'. Abbreviations for option names are allowed as long as they are
unique. When a long option takes an argument, like `--file-prefix',
connect the option name and the argument with `='.

Here is a list of options that can be used with Bison, alphabetized
by short option. It is followed by a cross key alphabetized by long
option.

Operation modes:
`-h'
`--help'
     Print a summary of the command-line options to Bison and exit.

`-V'
`--version'
     Print the version number of Bison and exit.

`-y'
`--yacc'
`--fixed-output-files'
     Equivalent to `-o y.tab.c'; the parser output file is called
     `y.tab.c', and the other outputs are called `y.output' and
     `y.tab.h'. The purpose of this option is to imitate Yacc's output
     file name conventions. Thus, the following shell script can
     substitute for Yacc:

          bison -y "$@"

Tuning the parser:

`-S FILE'
`--skeleton=FILE'
     Specify the skeleton to use. You probably don't need this option
     unless you are developing Bison.

`-t'
`--debug'
     Output a definition of the macro `YYDEBUG' into the parser file, so
     that the debugging facilities are compiled. *Note Debugging Your
     Parser: Debugging.

`--locations'
     Pretend that `%locations' was specified. *Note Decl Summary::.

`-p PREFIX'
`--name-prefix=PREFIX'
     Pretend that `%name-prefix="PREFIX"' was specified. *Note Decl
     Summary::.

`-l'
`--no-lines'
     Don't put any `#line' preprocessor commands in the parser file.
     Ordinarily Bison puts them in the parser file so that the C
     compiler and debuggers will associate errors with your source
     file, the grammar file. This option causes them to associate
     errors with the parser file, treating it as an independent source
     file in its own right.

`-n'
`--no-parser'
     Pretend that `%no-parser' was specified. *Note Decl Summary::.

`-k'
`--token-table'
     Pretend that `%token-table' was specified. *Note Decl Summary::.

Adjust the output:

`-d'
`--defines'
     Pretend that `%defines' was specified, i.e., write an extra output
     file containing macro definitions for the token type names defined
     in the grammar and the semantic value type `YYSTYPE', as well as a
     few `extern' variable declarations. *Note Decl Summary::.

`--defines=DEFINES-FILE'
     Same as above, but save in the file DEFINES-FILE.

`-b FILE-PREFIX'
`--file-prefix=PREFIX'
     Pretend that `%file-prefix' was specified, i.e., specify the
     prefix to use for all Bison output file names. *Note Decl
     Summary::.

`-v'
`--verbose'
     Pretend that `%verbose' was specified, i.e., write an extra output
     file containing verbose descriptions of the grammar and parser.
     *Note Decl Summary::.

`-o FILENAME'
`--output=FILENAME'
     Specify the FILENAME for the parser file.

     The other output files' names are constructed from FILENAME as
     described under the `-v' and `-d' options.

`-g'
     Output a VCG definition of the LALR(1) grammar automaton computed
     by Bison. If the grammar file is `foo.y', the VCG output file will
     be `foo.vcg'.

`--graph=GRAPH-FILE'
     `--graph' behaves the same as `-g'. The only difference is that it
     takes an optional argument, the name of the output graph file.

File: bison.info, Node: Environment Variables, Next: Option Cross Key, Prev: Bison Options, Up: Invocation

Environment Variables
=====================

Here is a list of environment variables which affect the way Bison
runs.

`BISON_SIMPLE'
`BISON_HAIRY'
     Much of the parser generated by Bison is copied verbatim from a
     file called `bison.simple'. If Bison cannot find that file, or if
     you would like to direct Bison to use a different copy, setting the
     environment variable `BISON_SIMPLE' to the path of the file will
     cause Bison to use that copy instead.

     When the `%semantic_parser' declaration is used, Bison copies from
     a file called `bison.hairy' instead. The location of this file can
     also be specified or overridden in a similar fashion, with the
     `BISON_HAIRY' environment variable.

File: bison.info, Node: Option Cross Key, Next: VMS Invocation, Prev: Environment Variables, Up: Invocation

Option Cross Key
================

Here is a list of options, alphabetized by long option, to help you
find the corresponding short option.

     --debug                               -t
     --defines=DEFINES-FILE                -d
     --file-prefix=PREFIX                  -b FILE-PREFIX
     --fixed-output-files --yacc           -y
     --graph=GRAPH-FILE                    -g
     --help                                -h
     --name-prefix=PREFIX                  -p NAME-PREFIX
     --no-lines                            -l
     --no-parser                           -n
     --output=OUTFILE                      -o OUTFILE
     --token-table                         -k
     --verbose                             -v
     --version                             -V
