Some wrapping.

Akim Demaille
2005-12-27 15:42:44 +00:00
parent 3b0ffc7ec1
commit f8e1c9e55b

@@ -910,29 +910,27 @@ parser recognizes all valid declarations, according to the
 limited syntax above, transparently. In fact, the user does not even
 notice when the parser splits.
 
-So here we have a case where we can use the benefits of @acronym{GLR}, almost
-without disadvantages. Even in simple cases like this, however, there
-are at least two potential problems to beware.
-First, always analyze the conflicts reported by
-Bison to make sure that @acronym{GLR} splitting is only done where it is
-intended. A @acronym{GLR} parser splitting inadvertently may cause
-problems less obvious than an @acronym{LALR} parser statically choosing the
-wrong alternative in a conflict.
-Second, consider interactions with the lexer (@pxref{Semantic Tokens})
-with great care. Since a split parser consumes tokens
-without performing any actions during the split, the lexer cannot
-obtain information via parser actions. Some cases of
-lexer interactions can be eliminated by using @acronym{GLR} to
-shift the complications from the lexer to the parser. You must check
-the remaining cases for correctness.
+So here we have a case where we can use the benefits of @acronym{GLR},
+almost without disadvantages. Even in simple cases like this, however,
+there are at least two potential problems to beware. First, always
+analyze the conflicts reported by Bison to make sure that @acronym{GLR}
+splitting is only done where it is intended. A @acronym{GLR} parser
+splitting inadvertently may cause problems less obvious than an
+@acronym{LALR} parser statically choosing the wrong alternative in a
+conflict. Second, consider interactions with the lexer (@pxref{Semantic
+Tokens}) with great care. Since a split parser consumes tokens without
+performing any actions during the split, the lexer cannot obtain
+information via parser actions. Some cases of lexer interactions can be
+eliminated by using @acronym{GLR} to shift the complications from the
+lexer to the parser. You must check the remaining cases for
+correctness.
 
-In our example, it would be safe for the lexer to return tokens
-based on their current meanings in some symbol table, because no new
-symbols are defined in the middle of a type declaration. Though it
-is possible for a parser to define the enumeration
-constants as they are parsed, before the type declaration is
-completed, it actually makes no difference since they cannot be used
-within the same enumerated type declaration.
+In our example, it would be safe for the lexer to return tokens based on
+their current meanings in some symbol table, because no new symbols are
+defined in the middle of a type declaration. Though it is possible for
+a parser to define the enumeration constants as they are parsed, before
+the type declaration is completed, it actually makes no difference since
+they cannot be used within the same enumerated type declaration.
 
 @node Merging GLR Parses
 @subsection Using @acronym{GLR} to Resolve Ambiguities
@@ -2585,13 +2583,13 @@ continues until end of line.
 @cindex Prologue
 @cindex declarations
 
-The @var{Prologue} section contains macro definitions and
-declarations of functions and variables that are used in the actions in the
-grammar rules. These are copied to the beginning of the parser file so
-that they precede the definition of @code{yyparse}. You can use
-@samp{#include} to get the declarations from a header file. If you don't
-need any C declarations, you may omit the @samp{%@{} and @samp{%@}}
-delimiters that bracket this section.
+The @var{Prologue} section contains macro definitions and declarations
+of functions and variables that are used in the actions in the grammar
+rules. These are copied to the beginning of the parser file so that
+they precede the definition of @code{yyparse}. You can use
+@samp{#include} to get the declarations from a header file. If you
+don't need any C declarations, you may omit the @samp{%@{} and
+@samp{%@}} delimiters that bracket this section.
 
 You may have more than one @var{Prologue} section, intermixed with the
 @var{Bison declarations}. This allows you to have C and Bison
@@ -2661,10 +2659,10 @@ even if you define them in the Epilogue.
 If the last section is empty, you may omit the @samp{%%} that separates it
 from the grammar rules.
 
-The Bison parser itself contains many macros and identifiers whose
-names start with @samp{yy} or @samp{YY}, so it is a
-good idea to avoid using any such names (except those documented in this
-manual) in the epilogue of the grammar file.
+The Bison parser itself contains many macros and identifiers whose names
+start with @samp{yy} or @samp{YY}, so it is a good idea to avoid using
+any such names (except those documented in this manual) in the epilogue
+of the grammar file.
 
 @node Symbols
 @section Symbols, Terminal and Nonterminal
@@ -2680,13 +2678,13 @@ A @dfn{terminal symbol} (also known as a @dfn{token type}) represents a
 class of syntactically equivalent tokens. You use the symbol in grammar
 rules to mean that a token in that class is allowed. The symbol is
 represented in the Bison parser by a numeric code, and the @code{yylex}
-function returns a token type code to indicate what kind of token has been
-read. You don't need to know what the code value is; you can use the
-symbol to stand for it.
+function returns a token type code to indicate what kind of token has
+been read. You don't need to know what the code value is; you can use
+the symbol to stand for it.
 
-A @dfn{nonterminal symbol} stands for a class of syntactically equivalent
-groupings. The symbol name is used in writing grammar rules. By convention,
-it should be all lower case.
+A @dfn{nonterminal symbol} stands for a class of syntactically
+equivalent groupings. The symbol name is used in writing grammar rules.
+By convention, it should be all lower case.
 
 Symbol names can contain letters, digits (not at the beginning),
 underscores and periods. Periods make sense only in nonterminals.
@@ -2791,17 +2789,17 @@ characters in the following C-language string:
"\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_@{|@}~" "\a\b\t\n\v\f\r !\"#%&'()*+,-./:;<=>?[\\]^_@{|@}~"
 @end example
 
-The @code{yylex} function and Bison must use a consistent character
-set and encoding for character tokens. For example, if you run Bison in an
-@acronym{ASCII} environment, but then compile and run the resulting program
-in an environment that uses an incompatible character set like
-@acronym{EBCDIC}, the resulting program may not work because the
-tables generated by Bison will assume @acronym{ASCII} numeric values for
-character tokens. It is standard
-practice for software distributions to contain C source files that
-were generated by Bison in an @acronym{ASCII} environment, so installers on
-platforms that are incompatible with @acronym{ASCII} must rebuild those
-files before compiling them.
+The @code{yylex} function and Bison must use a consistent character set
+and encoding for character tokens. For example, if you run Bison in an
+@acronym{ASCII} environment, but then compile and run the resulting
+program in an environment that uses an incompatible character set like
+@acronym{EBCDIC}, the resulting program may not work because the tables
+generated by Bison will assume @acronym{ASCII} numeric values for
+character tokens. It is standard practice for software distributions to
+contain C source files that were generated by Bison in an
+@acronym{ASCII} environment, so installers on platforms that are
+incompatible with @acronym{ASCII} must rebuild those files before
+compiling them.
 
 The symbol @code{error} is a terminal symbol reserved for error recovery
 (@pxref{Error Recovery}); you shouldn't use it for any other purpose.
@@ -2908,10 +2906,10 @@ with no components.
 @section Recursive Rules
 @cindex recursive rule
 
-A rule is called @dfn{recursive} when its @var{result} nonterminal appears
-also on its right hand side. Nearly all Bison grammars need to use
-recursion, because that is the only way to define a sequence of any number
-of a particular thing. Consider this recursive definition of a
+A rule is called @dfn{recursive} when its @var{result} nonterminal
+appears also on its right hand side. Nearly all Bison grammars need to
+use recursion, because that is the only way to define a sequence of any
+number of a particular thing. Consider this recursive definition of a
 comma-separated sequence of one or more expressions:
 
 @example
@@ -3025,8 +3023,9 @@ This macro definition must go in the prologue of the grammar file
 
 In most programs, you will need different data types for different kinds
 of tokens and groupings. For example, a numeric constant may need type
-@code{int} or @code{long int}, while a string constant needs type @code{char *},
-and an identifier might need a pointer to an entry in the symbol table.
+@code{int} or @code{long int}, while a string constant needs type
+@code{char *}, and an identifier might need a pointer to an entry in the
+symbol table.
 
 To use more than one data type for semantic values in one parser, Bison
 requires you to do two things:
@@ -4068,13 +4067,12 @@ is named @file{@var{name}.h}.
 
 Unless @code{YYSTYPE} is already defined as a macro, the output header
 declares @code{YYSTYPE}. Therefore, if you are using a @code{%union}
-(@pxref{Multiple Types, ,More Than One Value Type}) with components
-that require other definitions, or if you have defined a
-@code{YYSTYPE} macro (@pxref{Value Type, ,Data Types of Semantic
-Values}), you need to arrange for these definitions to be propagated to
-all modules, e.g., by putting them in a
-prerequisite header that is included both by your parser and by any
-other module that needs @code{YYSTYPE}.
+(@pxref{Multiple Types, ,More Than One Value Type}) with components that
+require other definitions, or if you have defined a @code{YYSTYPE} macro
+(@pxref{Value Type, ,Data Types of Semantic Values}), you need to
+arrange for these definitions to be propagated to all modules, e.g., by
+putting them in a prerequisite header that is included both by your
+parser and by any other module that needs @code{YYSTYPE}.
 
 Unless your parser is pure, the output header declares @code{yylval}
 as an external variable. @xref{Pure Decl, ,A Pure (Reentrant)
@@ -4085,11 +4083,11 @@ If you have also used locations, the output header declares
 @code{YYSTYPE} and @code{yylval}. @xref{Locations, ,Tracking
 Locations}.
 
-This output file is normally essential if you wish to put the
-definition of @code{yylex} in a separate source file, because
-@code{yylex} typically needs to be able to refer to the
-above-mentioned declarations and to the token type codes.
-@xref{Token Values, ,Semantic Values of Tokens}.
+This output file is normally essential if you wish to put the definition
+of @code{yylex} in a separate source file, because @code{yylex}
+typically needs to be able to refer to the above-mentioned declarations
+and to the token type codes. @xref{Token Values, ,Semantic Values of
+Tokens}.
 @end deffn
 
 @deffn {Directive} %destructor
@@ -4500,12 +4498,11 @@ then the code in @code{yylex} might look like this:
 
 @vindex yylloc
 If you are using the @samp{@@@var{n}}-feature (@pxref{Locations, ,
-Tracking Locations}) in actions to keep track of the
-textual locations of tokens and groupings, then you must provide this
-information in @code{yylex}. The function @code{yyparse} expects to
-find the textual location of a token just parsed in the global variable
-@code{yylloc}. So @code{yylex} must store the proper data in that
-variable.
+Tracking Locations}) in actions to keep track of the textual locations
+of tokens and groupings, then you must provide this information in
+@code{yylex}. The function @code{yyparse} expects to find the textual
+location of a token just parsed in the global variable @code{yylloc}.
+So @code{yylex} must store the proper data in that variable.
 
 By default, the value of @code{yylloc} is a structure and you need only
 initialize the members that are going to be used by the actions. The
@@ -4842,12 +4839,11 @@ Tracking Locations}.
 
 A Bison-generated parser can print diagnostics, including error and
 tracing messages. By default, they appear in English. However, Bison
-also supports outputting diagnostics in the user's native language.
-To make this work, the user should set the usual environment
-variables. @xref{Users, , The User's View, gettext, GNU
-@code{gettext} utilities}. For
-example, the shell command @samp{export LC_ALL=fr_CA.UTF-8} might set
-the user's locale to French Canadian using the @acronym{UTF}-8
+also supports outputting diagnostics in the user's native language. To
+make this work, the user should set the usual environment variables.
+@xref{Users, , The User's View, gettext, GNU @code{gettext} utilities}.
+For example, the shell command @samp{export LC_ALL=fr_CA.UTF-8} might
+set the user's locale to French Canadian using the @acronym{UTF}-8
 encoding. The exact set of available locales depends on the user's
 installation.
 