* doc/bison.texinfo: Update copyright date.

(Rpcalc Lexer, Symbols, Token Decl): Don't assume ASCII.
(Symbols): Warn about running Bison in one character set,
but compiling and/or running in an incompatible one.
Warn about character code 256, too.
Paul Eggert
2002-04-04 21:34:34 +00:00
parent cd6a695eb9
commit e966383bf4
2 changed files with 41 additions and 8 deletions


@@ -1,3 +1,18 @@
2002-04-04 Paul Eggert <eggert@twinsun.com>
* doc/bison.texinfo: Update copyright date.
(Rpcalc Lexer, Symbols, Token Decl): Don't assume ASCII.
(Symbols): Warn about running Bison in one character set,
but compiling and/or running in an incompatible one.
Warn about character code 256, too.
2002-04-03 Paul Eggert <eggert@twinsun.com>
* src/bison.data (YYSTACK_ALLOC): Depend on whether
YYERROR_VERBOSE is nonzero, not whether it is defined.
Merge changes from bison-1_29-branch.
2002-03-20 Paul Eggert <eggert@twinsun.com>
Merge fixes from Debian bison_1.34-1.diff.


@@ -47,7 +47,7 @@ END-INFO-DIR-ENTRY
This file documents the Bison parser generator.
Copyright (C) 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998, 1999,
-2000, 2001
+2000, 2001, 2002
Free Software Foundation, Inc.
Permission is granted to make and distribute verbatim copies of
@@ -89,7 +89,7 @@ instead of in the original English.
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1988, 1989, 1990, 1991, 1992, 1993, 1995, 1998,
-1999, 2000, 2001
+1999, 2000, 2001, 2002
Free Software Foundation, Inc.
@sp 2
@@ -1083,7 +1083,7 @@ The return value of the lexical analyzer function is a numeric code which
represents a token type. The same text used in Bison rules to stand for
this token type is also a C expression for the numeric code for the type.
This works in two ways. If the token type is a character literal, then its
-numeric code is the ASCII code for that character; you can use the same
+numeric code is that of the character; you can use the same
character literal in the lexical analyzer to express the number. If the
token type is an identifier, that identifier is defined by Bison as a C
macro whose definition is the appropriate number. In this example,
@@ -1104,8 +1104,8 @@ Here is the code for the lexical analyzer:
@example
@group
/* Lexical analyzer returns a double floating point
-number on the stack and the token NUM, or the ASCII
-character read if not a number. Skips all blanks
+number on the stack and the token NUM, or the numeric code
+of the character read if not a number. Skips all blanks
and tabs, returns 0 for EOF. */
#include <ctype.h>
@@ -2148,7 +2148,7 @@ your program will confuse other readers.
All the usual escape sequences used in character literals in C can be
used in Bison as well, but you must not use the null character as a
-character literal because its ASCII code, zero, is the code @code{yylex}
+character literal because its numeric code, zero, is the code @code{yylex}
returns for end-of-input (@pxref{Calling Convention, ,Calling Convention
for @code{yylex}}).
@@ -2189,7 +2189,7 @@ on when the parser function returns that symbol.
The value returned by @code{yylex} is always one of the terminal symbols
(or 0 for end-of-input). Whichever way you write the token type in the
grammar rules, you write it the same way in the definition of @code{yylex}.
-The numeric code for a character token type is simply the ASCII code for
+The numeric code for a character token type is simply the numeric code of
the character, so @code{yylex} can use the identical character constant to
generate the requisite code. Each named token type becomes a C macro in
the parser file, so @code{yylex} can use the name to stand for the code.
@@ -2202,9 +2202,27 @@ option when you run Bison, so that it will write these macro definitions
into a separate header file @file{@var{name}.tab.h} which you can include
in the other source files that need it. @xref{Invocation, ,Invoking Bison}.
+The @code{yylex} function must use the same character set and encoding
+that was used by Bison. For example, if you run Bison in an
+@sc{ascii} environment, but then compile and run the resulting program
+in an environment that uses an incompatible character set like
+@sc{ebcdic}, the resulting program will probably not work because the
+tables generated by Bison will assume @sc{ascii} numeric values for
+character tokens. Portable grammars should avoid non-@sc{ascii}
+character tokens, as implementations in practice often use different
+and incompatible extensions in this area. However, it is standard
+practice for software distributions to contain C source files that
+were generated by Bison in an @sc{ascii} environment, so installers on
+platforms that are incompatible with @sc{ascii} must rebuild those
+files before compiling them.
The symbol @code{error} is a terminal symbol reserved for error recovery
(@pxref{Error Recovery}); you shouldn't use it for any other purpose.
In particular, @code{yylex} should never return this value.
+The default value of the error token is 256, so in the
+unlikely event that you need to use a character token with numeric
+value 256 you must reassign the error token's value with a
+@code{%token} declaration.
@node Rules
@section Syntax of Grammar Rules
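Note on the hunk above: the convention it documents can be illustrated with a minimal C sketch. This is not code from the commit or from Bison. The names sketch_lex, sketch_lval, SKETCH_NUM and the value 258 are hypothetical stand-ins (in a real parser Bison defines the token macros and yylval). Only the values 0 for end-of-input and 256 for the default error token come from the text. The sketch returns each character token as its numeric code in the compiling environment's character set, without assuming ASCII.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical codes for illustration only.  0 (end of input) and 256
   (default error token) follow the documentation text above; 258 for
   the named token is an assumption, since Bison normally picks it.  */
enum { SKETCH_EOF = 0, SKETCH_ERROR = 256, SKETCH_NUM = 258 };

/* Stands in for yylval in this sketch.  */
double sketch_lval;

/* Nonzero when no character code can collide with the error token,
   which holds whenever unsigned char is 8 bits wide.  */
int
sketch_codes_are_disjoint (void)
{
  return UCHAR_MAX < SKETCH_ERROR;
}

/* Scan one token from *INPUT and advance *INPUT past it.
   Returns SKETCH_NUM for a number, 0 at end of input, and otherwise
   the numeric code of the character read.  */
int
sketch_lex (const char **input)
{
  const char *p;

  assert (input != NULL && *input != NULL);
  p = *input;

  /* Skip blanks and tabs.  */
  while (*p == ' ' || *p == '\t')
    p++;

  if (*p == '\0')
    {
      *input = p;
      return SKETCH_EOF;
    }

  if (isdigit ((unsigned char) *p) || *p == '.')
    {
      char *end;
      sketch_lval = strtod (p, &end);
      if (end != p)
        {
          *input = end;
          return SKETCH_NUM;
        }
    }

  /* Character token: convert through unsigned char so the code is
     never negative, whatever the character set.  */
  *input = p + 1;
  return (unsigned char) *p;
}
```

Because the lexer and the grammar both spell the token as the same character literal, for example '+', the two agree only if Bison and the C compiler use the same character set, which is the condition the new paragraph warns about.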
@@ -2942,7 +2960,7 @@ an integer value in the field immediately following the token name:
@noindent
It is generally best, however, to let Bison choose the numeric codes for
all token types. Bison will automatically select codes that don't conflict
-with each other or with ASCII characters.
+with each other or with normal characters.
In the event that the stack type is a union, you must augment the
@code{%token} or other token declaration to include the data type