mirror of
https://git.savannah.gnu.org/git/bison.git
synced 2026-03-23 03:03:02 +00:00
Some wrapping.
@@ -910,29 +910,27 @@ parser recognizes all valid declarations, according to the
 limited syntax above, transparently. In fact, the user does not even
 notice when the parser splits.
 
-So here we have a case where we can use the benefits of @acronym{GLR}, almost
-without disadvantages. Even in simple cases like this, however, there
-are at least two potential problems to beware.
-First, always analyze the conflicts reported by
-Bison to make sure that @acronym{GLR} splitting is only done where it is
-intended. A @acronym{GLR} parser splitting inadvertently may cause
-problems less obvious than an @acronym{LALR} parser statically choosing the
-wrong alternative in a conflict.
-Second, consider interactions with the lexer (@pxref{Semantic Tokens})
-with great care. Since a split parser consumes tokens
-without performing any actions during the split, the lexer cannot
-obtain information via parser actions. Some cases of
-lexer interactions can be eliminated by using @acronym{GLR} to
-shift the complications from the lexer to the parser. You must check
-the remaining cases for correctness.
+So here we have a case where we can use the benefits of @acronym{GLR},
+almost without disadvantages. Even in simple cases like this, however,
+there are at least two potential problems to beware. First, always
+analyze the conflicts reported by Bison to make sure that @acronym{GLR}
+splitting is only done where it is intended. A @acronym{GLR} parser
+splitting inadvertently may cause problems less obvious than an
+@acronym{LALR} parser statically choosing the wrong alternative in a
+conflict. Second, consider interactions with the lexer (@pxref{Semantic
+Tokens}) with great care. Since a split parser consumes tokens without
+performing any actions during the split, the lexer cannot obtain
+information via parser actions. Some cases of lexer interactions can be
+eliminated by using @acronym{GLR} to shift the complications from the
+lexer to the parser. You must check the remaining cases for
+correctness.
 
-In our example, it would be safe for the lexer to return tokens
-based on their current meanings in some symbol table, because no new
-symbols are defined in the middle of a type declaration. Though it
-is possible for a parser to define the enumeration
-constants as they are parsed, before the type declaration is
-completed, it actually makes no difference since they cannot be used
-within the same enumerated type declaration.
+In our example, it would be safe for the lexer to return tokens based on
+their current meanings in some symbol table, because no new symbols are
+defined in the middle of a type declaration. Though it is possible for
+a parser to define the enumeration constants as they are parsed, before
+the type declaration is completed, it actually makes no difference since
+they cannot be used within the same enumerated type declaration.
 
 @node Merging GLR Parses
 @subsection Using @acronym{GLR} to Resolve Ambiguities
@@ -2585,13 +2583,13 @@ continues until end of line.
 @cindex Prologue
 @cindex declarations
 
-The @var{Prologue} section contains macro definitions and
-declarations of functions and variables that are used in the actions in the
-grammar rules. These are copied to the beginning of the parser file so
-that they precede the definition of @code{yyparse}. You can use
-@samp{#include} to get the declarations from a header file. If you don't
-need any C declarations, you may omit the @samp{%@{} and @samp{%@}}
-delimiters that bracket this section.
+The @var{Prologue} section contains macro definitions and declarations
+of functions and variables that are used in the actions in the grammar
+rules. These are copied to the beginning of the parser file so that
+they precede the definition of @code{yyparse}. You can use
+@samp{#include} to get the declarations from a header file. If you
+don't need any C declarations, you may omit the @samp{%@{} and
+@samp{%@}} delimiters that bracket this section.
 
 You may have more than one @var{Prologue} section, intermixed with the
 @var{Bison declarations}. This allows you to have C and Bison
@@ -2661,10 +2659,10 @@ even if you define them in the Epilogue.
 If the last section is empty, you may omit the @samp{%%} that separates it
 from the grammar rules.
 
-The Bison parser itself contains many macros and identifiers whose
-names start with @samp{yy} or @samp{YY}, so it is a
-good idea to avoid using any such names (except those documented in this
-manual) in the epilogue of the grammar file.
+The Bison parser itself contains many macros and identifiers whose names
+start with @samp{yy} or @samp{YY}, so it is a good idea to avoid using
+any such names (except those documented in this manual) in the epilogue
+of the grammar file.
 
 @node Symbols
 @section Symbols, Terminal and Nonterminal
@@ -2680,13 +2678,13 @@ A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a
 class of syntactically equivalent tokens. You use the symbol in grammar
 rules to mean that a token in that class is allowed. The symbol is
 represented in the Bison parser by a numeric code, and the @code{yylex}
-function returns a token type code to indicate what kind of token has been
-read. You don't need to know what the code value is; you can use the
-symbol to stand for it.
+function returns a token type code to indicate what kind of token has
+been read. You don't need to know what the code value is; you can use
+the symbol to stand for it.
 
-A @dfn{nonterminal symbol} stands for a class of syntactically equivalent
-groupings. The symbol name is used in writing grammar rules. By convention,
-it should be all lower case.
+A @dfn{nonterminal symbol} stands for a class of syntactically
+equivalent groupings. The symbol name is used in writing grammar rules.
+By convention, it should be all lower case.
 
 Symbol names can contain letters, digits (not at the beginning),
 underscores and periods. Periods make sense only in nonterminals.
@@ -2791,17 +2789,17 @@ characters in the following C-language string:
 "\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_@{|@}~"
 @end example
 
-The @code{yylex} function and Bison must use a consistent character
-set and encoding for character tokens. For example, if you run Bison in an
-@acronym{ASCII} environment, but then compile and run the resulting program
-in an environment that uses an incompatible character set like
-@acronym{EBCDIC}, the resulting program may not work because the
-tables generated by Bison will assume @acronym{ASCII} numeric values for
-character tokens. It is standard
-practice for software distributions to contain C source files that
-were generated by Bison in an @acronym{ASCII} environment, so installers on
-platforms that are incompatible with @acronym{ASCII} must rebuild those
-files before compiling them.
+The @code{yylex} function and Bison must use a consistent character set
+and encoding for character tokens. For example, if you run Bison in an
+@acronym{ASCII} environment, but then compile and run the resulting
+program in an environment that uses an incompatible character set like
+@acronym{EBCDIC}, the resulting program may not work because the tables
+generated by Bison will assume @acronym{ASCII} numeric values for
+character tokens. It is standard practice for software distributions to
+contain C source files that were generated by Bison in an
+@acronym{ASCII} environment, so installers on platforms that are
+incompatible with @acronym{ASCII} must rebuild those files before
+compiling them.
 
 The symbol @code{error} is a terminal symbol reserved for error recovery
 (@pxref{Error Recovery}); you shouldn't use it for any other purpose.
@@ -2908,10 +2906,10 @@ with no components.
 @section Recursive Rules
 @cindex recursive rule
 
-A rule is called @dfn{recursive} when its @var{result} nonterminal appears
-also on its right hand side. Nearly all Bison grammars need to use
-recursion, because that is the only way to define a sequence of any number
-of a particular thing. Consider this recursive definition of a
+A rule is called @dfn{recursive} when its @var{result} nonterminal
+appears also on its right hand side. Nearly all Bison grammars need to
+use recursion, because that is the only way to define a sequence of any
+number of a particular thing. Consider this recursive definition of a
 comma-separated sequence of one or more expressions:
 
 @example
@@ -3025,8 +3023,9 @@ This macro definition must go in the prologue of the grammar file
 
 In most programs, you will need different data types for different kinds
 of tokens and groupings. For example, a numeric constant may need type
-@code{int} or @code{long int}, while a string constant needs type @code{char *},
-and an identifier might need a pointer to an entry in the symbol table.
+@code{int} or @code{long int}, while a string constant needs type
+@code{char *}, and an identifier might need a pointer to an entry in the
+symbol table.
 
 To use more than one data type for semantic values in one parser, Bison
 requires you to do two things:
@@ -4068,13 +4067,12 @@ is named @file{@var{name}.h}.
 
 Unless @code{YYSTYPE} is already defined as a macro, the output header
 declares @code{YYSTYPE}. Therefore, if you are using a @code{%union}
-(@pxref{Multiple Types, ,More Than One Value Type}) with components
-that require other definitions, or if you have defined a
-@code{YYSTYPE} macro (@pxref{Value Type, ,Data Types of Semantic
-Values}), you need to arrange for these definitions to be propagated to
-all modules, e.g., by putting them in a
-prerequisite header that is included both by your parser and by any
-other module that needs @code{YYSTYPE}.
+(@pxref{Multiple Types, ,More Than One Value Type}) with components that
+require other definitions, or if you have defined a @code{YYSTYPE} macro
+(@pxref{Value Type, ,Data Types of Semantic Values}), you need to
+arrange for these definitions to be propagated to all modules, e.g., by
+putting them in a prerequisite header that is included both by your
+parser and by any other module that needs @code{YYSTYPE}.
 
 Unless your parser is pure, the output header declares @code{yylval}
 as an external variable. @xref{Pure Decl, ,A Pure (Reentrant)
@@ -4085,11 +4083,11 @@ If you have also used locations, the output header declares
 @code{YYSTYPE} and @code{yylval}. @xref{Locations, ,Tracking
 Locations}.
 
-This output file is normally essential if you wish to put the
-definition of @code{yylex} in a separate source file, because
-@code{yylex} typically needs to be able to refer to the
-above-mentioned declarations and to the token type codes.
-@xref{Token Values, ,Semantic Values of Tokens}.
+This output file is normally essential if you wish to put the definition
+of @code{yylex} in a separate source file, because @code{yylex}
+typically needs to be able to refer to the above-mentioned declarations
+and to the token type codes. @xref{Token Values, ,Semantic Values of
+Tokens}.
 @end deffn
 
 @deffn {Directive} %destructor
@@ -4500,12 +4498,11 @@ then the code in @code{yylex} might look like this:
 
 @vindex yylloc
 If you are using the @samp{@@@var{n}}-feature (@pxref{Locations, ,
-Tracking Locations}) in actions to keep track of the
-textual locations of tokens and groupings, then you must provide this
-information in @code{yylex}. The function @code{yyparse} expects to
-find the textual location of a token just parsed in the global variable
-@code{yylloc}. So @code{yylex} must store the proper data in that
-variable.
+Tracking Locations}) in actions to keep track of the textual locations
+of tokens and groupings, then you must provide this information in
+@code{yylex}. The function @code{yyparse} expects to find the textual
+location of a token just parsed in the global variable @code{yylloc}.
+So @code{yylex} must store the proper data in that variable.
 
 By default, the value of @code{yylloc} is a structure and you need only
 initialize the members that are going to be used by the actions. The
@@ -4842,12 +4839,11 @@ Tracking Locations}.
 
 A Bison-generated parser can print diagnostics, including error and
 tracing messages. By default, they appear in English. However, Bison
-also supports outputting diagnostics in the user's native language.
-To make this work, the user should set the usual environment
-variables. @xref{Users, , The User's View, gettext, GNU
-@code{gettext} utilities}. For
-example, the shell command @samp{export LC_ALL=fr_CA.UTF-8} might set
-the user's locale to French Canadian using the @acronym{UTF}-8
+also supports outputting diagnostics in the user's native language. To
+make this work, the user should set the usual environment variables.
+@xref{Users, , The User's View, gettext, GNU @code{gettext} utilities}.
+For example, the shell command @samp{export LC_ALL=fr_CA.UTF-8} might
+set the user's locale to French Canadian using the @acronym{UTF}-8
 encoding. The exact set of available locales depends on the user's
 installation.
 