doc: promote YYEOF

* NEWS (Deep overhaul of the symbol and token kinds): New.
* doc/bison.texi: Promote YYEOF over "0" in scanners.
(Token Decl): No longer show YYEOF here, it now works by default.
(Token I18n): More details about YYEOF here.
(Calc++): Just use YYEOF.
Akim Demaille
2020-04-13 09:32:54 +02:00
parent 71e3f6d4da
commit dc1035bada
2 changed files with 56 additions and 38 deletions

35
NEWS

@@ -74,7 +74,6 @@ GNU Bison NEWS
 %token
 PLUS "+"
 MINUS "-"
-EOF 0 _("end of file")
 <double>
 NUM _("double precision number")
 <symrec*>
@@ -83,7 +82,7 @@ GNU Bison NEWS
 In that case the user must define _() and N_(), and yysymbol_name returns
 the translated symbol (i.e., it returns '_("variable")' rather than
-'"variable"').
+'"variable"'). In Java, the user must provide an i18n() function.
 *** List of expected tokens (yacc.c)
@@ -95,6 +94,38 @@ GNU Bison NEWS
 It makes little sense to use this feature without enabling LAC (lookahead
 correction).
+*** Deep overhaul of the symbol and token kinds
+To avoid the confusion with typing in programming languages, we now refer
+to token and symbol "kinds" instead of token and symbol "types".
+**** Token kind
+The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
+LPAREN, etc. Users are invited to replace their uses of "enum
+yytokentype" by "yytoken_kind_t".
+This type now also includes tokens that were previously hidden: YYEOF (end
+of input), YYUNDEF (undefined token), and YYERRCODE (error token). They
+now have string aliases, internationalized if internationalization is
+enabled. Therefore, by default, error messages now refer to "end of file"
+(internationalized) rather than the cryptic "$end".
+In most cases, it is now useless to define the end-of-file token as
+follows:
+%token EOF 0 _("end of file")
+Rather, simply use "YYEOF" in your scanner.
+**** Symbol kinds
+The "symbol kind" is what the parser actually uses. (Unless the
+api.token.raw %define variable was used, the internal symbol kind of a
+terminal differs from the corresponding token kind.)
+They are now exposed as an enum, "yysymbol_kind_t".
 *** Modernize display of explanatory statements in diagnostics
 Since Bison 2.7, output was indented four spaces for explanatory

doc/bison.texi

@@ -1903,7 +1903,7 @@ yylex (void)
 @group
 /* Return end-of-input. */
 else if (c == EOF)
-  return 0;
+  return YYEOF;
 /* Return a single char. */
 else
   return c;
@@ -2352,7 +2352,7 @@ yylex (void)
 /* Return end-of-input. */
 if (c == EOF)
-  return 0;
+  return YYEOF;
 @group
 /* Return a single char, and update location. */
@@ -2722,7 +2722,7 @@ yylex (void)
 c = getchar ();
 if (c == EOF)
-  return 0;
+  return YYEOF;
 @end group
 @group
@@ -4926,14 +4926,6 @@ would produce in French @samp{erreur de syntaxe, || inattendu, attendait
 nombre ou (} rather than @samp{erreur de syntaxe, || inattendu, attendait
 number ou (}.
-The token numbered as 0 corresponds to the end of file; the following line
-allows for nicer error messages referring to ``end of file''
-(internationalized) instead of ``$end'':
-@example
-%token END 0 _("end of file")
-@end example
 @node Precedence Decl
 @subsection Operator Precedence
 @cindex precedence declarations
@@ -7812,7 +7804,6 @@ or @code{detailed}, token aliases can be internationalized:
 @example
 %token
 '\n' _("end of line")
-EOF 0 _("end of file")
 <double>
 NUM _("double precision number")
 <symrec*>
@@ -7828,17 +7819,26 @@ If at least one token alias is internationalized, then the generated parser
 will use both @code{N_} and @code{_}, that must be defined
 (@pxref{Programmers, , The Programmers View, gettext, GNU @code{gettext}
 utilities}). They are used only on string aliases marked for translation.
-In other words, even if your catalog features a translation for ``end of
-line'', then with
+In other words, even if your catalog features a translation for
+``function'', then with
 @example
 %token
-'\n' "end of line"
-EOF 0 _("end of file")
+<symrec*>
+FUN "function"
+VAR _("variable")
 @end example
 @noindent
-``end of line'' will appear untranslated in debug traces and error messages.
+``function'' will appear untranslated in debug traces and error messages.
+Unless defined by the user, the end-of-file token, @code{YYEOF}, is given the
+alias ``end of file''. It is also internationalized if the user
+internationalized tokens. To map it to another string, use:
+@example
+%token END 0 _("end of input")
+@end example
 @node Algorithm
@@ -11401,17 +11401,7 @@ Symbols}). This directive:
 @noindent
 requests that Bison generates the functions @code{make_TEXT} and
-@code{make_NUMBER}. As a matter of fact, it is convenient to have also a
-symbol to mark the end of input, say @code{END_OF_FILE}:
-@comment file: c++/simple.yy: 1
-@example
-%token END_OF_FILE 0
-@end example
-@noindent
-The @code{0} tells Bison this token is special: when it is reached, parsing
-finishes.
+@code{make_NUMBER}, but also @code{make_YYEOF}, for the end of input.
 Everything is in place for our scanner:
@@ -11441,7 +11431,7 @@ Everything is in place for our scanner:
 @end group
 @group
 default:
-  return parser::make_END_OF_FILE ();
+  return parser::make_YYEOF ();
 @end group
 @}
 @}
@@ -12439,17 +12429,14 @@ file; it needs detailed knowledge about the driver.
 @noindent
-The token code 0 corresponds to end of file; the following line
-allows for nicer error messages referring to ``end of file'' instead of
-``$end''. Similarly user friendly names are provided for each symbol. To
-avoid name clashes in the generated files (@pxref{Calc++ Scanner}), prefix
-tokens with @code{TOK_} (@pxref{%define Summary}).
+User friendly names are provided for each symbol. To avoid name clashes in
+the generated files (@pxref{Calc++ Scanner}), prefix tokens with @code{TOK_}
+(@pxref{%define Summary}).
 @comment file: calc++/parser.yy
 @example
 %define api.token.prefix @{TOK_@}
 %token
-END 0 "end of file"
 ASSIGN ":="
 MINUS "-"
 PLUS "+"
@@ -12695,7 +12682,7 @@ The rules are simple. The driver is used to report errors.
 (loc, "invalid character: " + std::string(yytext));
 @}
 @end group
-<<EOF>> return yy::parser::make_END (loc);
+<<EOF>> return yy::parser::make_YYEOF (loc);
 %%
 @end example
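
For illustration only, the scanner pattern that these documentation changes promote can be sketched in C as follows. This is a minimal sketch and not code from the commit: scan_token, token_value, and the stand-in enum yytoken_kind_t (including the value chosen for NUM) are assumptions made so that the block compiles on its own. In a real project the enum comes from the Bison-generated header and the function would be yylex.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the enum that Bison emits in the generated parser header.
   YYEOF is 0 (the end-of-input token); the value given to NUM here is an
   assumption made for this sketch.  */
enum yytoken_kind_t { YYEOF = 0, NUM = 258 };

/* Hypothetical semantic value slot, playing the role of yylval.  */
static double token_value;

/* Hypothetical helper: read one token from IN.  */
static int
scan_token (FILE *in)
{
  int c;

  /* Skip blanks.  */
  while ((c = getc (in)) == ' ' || c == '\t')
    continue;

  /* Return end-of-input by name rather than as a bare 0.  */
  if (c == EOF)
    return YYEOF;

  /* Numbers: push the digit back and let fscanf read the whole value.  */
  if (c >= '0' && c <= '9')
    {
      ungetc (c, in);
      if (fscanf (in, "%lf", &token_value) != 1)
        abort ();
      return NUM;
    }

  /* Anything else is returned as a single-character token.  */
  return c;
}
```

Under these assumptions, a yylex in the style of the manual's examples would reduce to "return scan_token (stdin);", and no "%token EOF 0 ..." declaration is needed in the grammar for the scanner to signal end of input.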