doc: promote YYEOF

* NEWS (Deep overhaul of the symbol and token kinds): New.
* doc/bison.texi: Promote YYEOF over "0" in scanners.
(Token Decl): No longer show YYEOF here, it now works by default.
(Token I18n): More details about YYEOF here.
(Calc++): Just use YYEOF.
Akim Demaille
2020-04-13 09:32:54 +02:00
parent 71e3f6d4da
commit dc1035bada
2 changed files with 56 additions and 38 deletions

NEWS

@@ -74,7 +74,6 @@ GNU Bison NEWS
%token
PLUS "+"
MINUS "-"
EOF 0 _("end of file")
<double>
NUM _("double precision number")
<symrec*>
@@ -83,7 +82,7 @@ GNU Bison NEWS
In that case the user must define _() and N_(), and yysymbol_name returns
the translated symbol (i.e., it returns '_("variable")' rather that
'"variable"').
'"variable"'). In Java, the user must provide an i18n() function.
*** List of expected tokens (yacc.c)
@@ -95,6 +94,38 @@ GNU Bison NEWS
It makes little sense to use this feature without enabling LAC (lookahead
correction).
*** Deep overhaul of the symbol and token kinds
To avoid the confusion with typing in programming languages, we now refer
to token and symbol "kinds" instead of token and symbol "types".
**** Token kind
The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
LPAREN, etc. Users are invited to replace their uses of "enum
yytokentype" by "yytoken_kind_t".
This type now also includes tokens that were previously hidden: YYEOF (end
of input), YYUNDEF (undefined token), and YYERRCODE (error token). They
now have string aliases, internationalized if internationalization is
enabled. Therefore, by default, error messages now refer to "end of file"
(internationalized) rather than the cryptic "$end".
In most cases, it is now useless to define the end-of-file token as
follows:
%token EOF 0 _("end of file")
Rather simply use "YYEOF" in your scanner.
**** Symbol kinds
The "symbol kind" is what the parser actually uses. (Unless the
api.token.raw %define variable was used, the internal symbol kind of a
terminal differs from the corresponding token kind.)
They are now exposed as an enum, "yysymbol_kind_t".
*** Modernize display of explanatory statements in diagnostics
Since Bison 2.7, output was indented four spaces for explanatory

doc/bison.texi

@@ -1903,7 +1903,7 @@ yylex (void)
@group
/* Return end-of-input. */
else if (c == EOF)
return 0;
return YYEOF;
/* Return a single char. */
else
return c;
@@ -2352,7 +2352,7 @@ yylex (void)
/* Return end-of-input. */
if (c == EOF)
return 0;
return YYEOF;
@group
/* Return a single char, and update location. */
@@ -2722,7 +2722,7 @@ yylex (void)
c = getchar ();
if (c == EOF)
return 0;
return YYEOF;
@end group
@group
@@ -4926,14 +4926,6 @@ would produce in French @samp{erreur de syntaxe, || inattendu, attendait
nombre ou (} rather than @samp{erreur de syntaxe, || inattendu, attendait
number ou (}.
The token numbered as 0 corresponds to the end of file; the following line
allows for nicer error messages referring to ``end of file''
(internationalized) instead of ``$end'':
@example
%token END 0 _("end of file")
@end example
@node Precedence Decl
@subsection Operator Precedence
@cindex precedence declarations
@@ -7812,7 +7804,6 @@ or @code{detailed}, token aliases can be internationalized:
@example
%token
'\n' _("end of line")
EOF 0 _("end of file")
<double>
NUM _("double precision number")
<symrec*>
@@ -7828,17 +7819,26 @@ If at least one token alias is internationalized, then the generated parser
will use both @code{N_} and @code{_}, that must be defined
(@pxref{Programmers, , The Programmers View, gettext, GNU @code{gettext}
utilities}). They are used only on string aliases marked for translation.
In other words, even if your catalog features a translation for ``end of
line'', then with
In other words, even if your catalog features a translation for
``function'', then with
@example
%token
'\n' "end of line"
EOF 0 _("end of file")
<symrec*>
FUN "function"
VAR _("variable")
@end example
@noindent
``end of line'' will appear untranslated in debug traces and error messages.
``function'' will appear untranslated in debug traces and error messages.
Unless defined by the user, the end-of-file token, @code{YYEOF}, is given
the alias ``end of file''. It is also internationalized if the user
internationalized tokens. To map it to another string, use:
@example
%token END 0 _("end of input")
@end example
@node Algorithm
@@ -11401,17 +11401,7 @@ Symbols}). This directive:
@noindent
requests that Bison generates the functions @code{make_TEXT} and
@code{make_NUMBER}. As a matter of fact, it is convenient to have also a
symbol to mark the end of input, say @code{END_OF_FILE}:
@comment file: c++/simple.yy: 1
@example
%token END_OF_FILE 0
@end example
@noindent
The @code{0} tells Bison this token is special: when it is reached, parsing
finishes.
@code{make_NUMBER}, but also @code{make_YYEOF}, for the end of input.
Everything is in place for our scanner:
@@ -11441,7 +11431,7 @@ Everything is in place for our scanner:
@end group
@group
default:
return parser::make_END_OF_FILE ();
return parser::make_YYEOF ();
@end group
@}
@}
@@ -12439,17 +12429,14 @@ file; it needs detailed knowledge about the driver.
@noindent
The token code 0 corresponds to end of file; the following line
allows for nicer error messages referring to ``end of file'' instead of
``$end''. Similarly user friendly names are provided for each symbol. To
avoid name clashes in the generated files (@pxref{Calc++ Scanner}), prefix
tokens with @code{TOK_} (@pxref{%define Summary}).
User friendly names are provided for each symbol. To avoid name clashes in
the generated files (@pxref{Calc++ Scanner}), prefix tokens with @code{TOK_}
(@pxref{%define Summary}).
@comment file: calc++/parser.yy
@example
%define api.token.prefix @{TOK_@}
%token
END 0 "end of file"
ASSIGN ":="
MINUS "-"
PLUS "+"
@@ -12695,7 +12682,7 @@ The rules are simple. The driver is used to report errors.
(loc, "invalid character: " + std::string(yytext));
@}
@end group
<<EOF>> return yy::parser::make_END (loc);
<<EOF>> return yy::parser::make_YYEOF (loc);
%%
@end example
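
For reference, the scanner-side change promoted by this commit can be sketched as a standalone C fragment. This is only an illustration, not code from the Bison sources: the enum below is a stand-in for what a Bison-generated header would normally provide (only YYEOF, whose value is 0, comes from Bison), while NUM, scan_in and the int-typed yylval are hypothetical names chosen for the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Stand-in for the token kinds normally found in the header generated
   by Bison.  YYEOF is always 0; NUM is a hypothetical token.  */
enum yytoken_kind_t
{
  YYEOF = 0,
  NUM = 258
};

/* Hypothetical input stream; a real scanner might simply read stdin.  */
static FILE *scan_in;

/* Semantic value of the last NUM token (an assumption of this sketch).  */
static int yylval;

int
yylex (void)
{
  int c;

  assert (scan_in != NULL);

  /* Skip blanks.  */
  do
    c = getc (scan_in);
  while (c == ' ' || c == '\t');

  /* Return end-of-input by name rather than as a bare 0.  */
  if (c == EOF)
    return YYEOF;

  /* Accumulate a decimal number.  */
  if (isdigit (c))
    {
      yylval = 0;
      do
        {
          yylval = yylval * 10 + (c - '0');
          c = getc (scan_in);
        }
      while (c != EOF && isdigit (c));
      /* Push back the character that ended the number.  */
      if (c != EOF)
        ungetc (c, scan_in);
      return NUM;
    }

  /* Return a single char.  */
  return c;
}
```

Since YYEOF is 0, the behavior is the same as with "return 0;"; the gain is readability, and the grammar no longer needs its own "%token EOF 0" declaration just to name the end of input.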