mirror of
https://git.savannah.gnu.org/git/bison.git
synced 2026-03-21 10:13:03 +00:00
doc: updates for 3.6
* doc/bison.texi: More s/token type/token kind/. * NEWS: Update.
52
NEWS
@@ -19,7 +19,7 @@ GNU Bison NEWS
 *** Improved syntax error messages

 Two new values for the %define parse.error variable offer more control to
-the user.
+the user. Available in all the skeletons (C, C++, Java).

 **** %define parse.error detailed

@@ -34,7 +34,12 @@ GNU Bison NEWS
 **** %define parse.error custom

 With this directive, the user forges and emits the syntax error message
-herself by defining a function such as:
+herself by defining the yyreport_syntax_error function. A new type,
+yypcontext_t, captures the circumstances of the error, and provides the
+user with functions to get details, such as yypcontext_expected_tokens to
+get the list of expected token kinds.
+
+A possible implementation of yyreport_syntax_error is:

 int
 yyreport_syntax_error (const yypcontext_t *ctx)
@@ -86,35 +91,42 @@ GNU Bison NEWS

 *** List of expected tokens (yacc.c)

-At any point during parsing (including even before submitting the first
-token), push parsers may now invoke yypstate_expected_tokens to get the
-list of possible tokens. This feature can be used to propose
-autocompletion (see below the "bistromathic" example).
+Push parsers may invoke yypstate_expected_tokens at any point during
+parsing (including even before submitting the first token) to get the list
+of possible tokens. This feature can be used to propose autocompletion
+(see below the "bistromathic" example).

 It makes little sense to use this feature without enabling LAC (lookahead
 correction).

 *** Deep overhaul of the symbol and token kinds

-To avoid the confusion with typing in programming languages, we now refer
-to token and symbol "kinds" instead of token and symbol "types".
+To avoid the confusion with types in programming languages, we now refer
+to token and symbol "kinds" instead of token and symbol "types". The
+documentation and error messages have been revised.
+
+All the skeletons have been updated to use dedicated enum types rather
+than integral types. Special symbols are now regular citizens, instead of
+being declared in ad hoc ways.

 **** Token kinds

 The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
-LPAREN, etc. Users are invited to replace their uses of "enum
-yytokentype" by "yytoken_kind_t".
+LPAREN, etc. While backward compatibility is of course ensured, users are
+nonetheless invited to replace their uses of "enum yytokentype" by
+"yytoken_kind_t".

 This type now also includes tokens that were previously hidden: YYEOF (end
 of input), YYUNDEF (undefined token), and YYERRCODE (error token). They
-now have string aliases, internationalized if internationalization is
+now have string aliases, internationalized when internationalization is
 enabled. Therefore, by default, error messages now refer to "end of file"
-(internationalized) rather than the cryptic "$end".
+(internationalized) rather than the cryptic "$end", or to "invaid token"
+rather than "$undefined".

-In most case, it is now useless to define the end-of-line token as
-follows:
+Therefore in most cases it is now useless to define the end-of-line token
+as follows:

-%token EOF 0 _("end of file")
+%token T_EOF 0 "end of file"

 Rather simply use "YYEOF" in your scanner.

@@ -126,7 +138,9 @@ GNU Bison NEWS

 They are now exposed as a enum, "yysymbol_kind_t".

-This allows users to tailor the error messages the way they want.
+This allows users to tailor the error messages the way they want, or to
+process some symbols in a specific way in autocompletion (see the
+bistromathic example below).

 *** Modernize display of explanatory statements in diagnostics

@@ -166,12 +180,18 @@ GNU Bison NEWS
 The lexcalc example (a simple example in C based on Flex and Bison) now
 also demonstrates location tracking.


 A new C example, bistromathic, is a fully featured interactive calculator
 using many Bison features: pure interface, push parser, autocompletion
 based on the current parser state (using yypstate_expected_tokens),
 location tracking, internationalized custom error messages, lookahead
 correction, rich debug traces, etc.
+
+It shows how to depend on the symbol kinds to tailor autocompletion. For
+instance it recognizes the symbol kind "VARIABLE" to propose
+autocompletion on the existing variables, rather than of the word
+"variable".

 * Noteworthy changes in release 3.5.4 (2020-04-05) [stable]

 ** WARNING: Future backward-incompatibilities!
9
TODO
@@ -19,12 +19,11 @@
 - symbol.type_get should be kind_get, and it's not documented.
 - YYERRCODE and "end of file" and translation

-*** The documentation
-You can explicitly specify the numeric code for a token type...
+** Java
+*** Examples
+Have an example with a push parser. Use autocompletion in that case.

-The token numbered as 0.
-
-** Java: calc.at
+*** calc.at
 Stop hard-coding "Calc". Adjust local.at (look for FIXME).

 ** doc
@@ -1232,7 +1232,7 @@ action in a GLR parser.
 @cindex GLR parsers and @code{yylval}
 @vindex yylloc
 @cindex GLR parsers and @code{yylloc}
-In any semantic action, you can examine @code{yychar} to determine the type
+In any semantic action, you can examine @code{yychar} to determine the kind
 of the lookahead token present at the time of the associated reduction.
 After checking that @code{yychar} is not set to @code{YYEMPTY} or
 @code{YYEOF}, you can then examine @code{yylval} and @code{yylloc} to
@@ -1853,7 +1853,7 @@ for such a single-character token is the character itself.

 The return value of the lexical analyzer function is a numeric code which
 represents a token kind. The same text used in Bison rules to stand for
-this token kind is also a C expression for the numeric code for the type.
+this token kind is also a C expression for the numeric code of the kind.
 This works in two ways. If the token kind is a character literal, then its
 numeric code is that of the character; you can use the same character
 literal in the lexical analyzer to express the number. If the token kind is
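The convention described in this hunk (a character literal is its own numeric code, while a named token kind has a code chosen by the parser generator) can be sketched in plain C. This is a hand-written illustration, not Bison output: the names toy_token_kind, TOY_EOF, TOY_NUM and toy_lex are hypothetical, and the value 258 is only an assumption about where named kinds might start.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for a generated token-kind enum.
   0 marks end of input; named kinds sit above the character range. */
enum toy_token_kind { TOY_EOF = 0, TOY_NUM = 258 };

/* Scan one token from *cursor.  Single-character tokens are returned as
   their own character code; numbers are returned as TOY_NUM with their
   value stored in *value.  *cursor is advanced past the token. */
int toy_lex(const char **cursor, int *value)
{
  assert(cursor != 0 && *cursor != 0 && value != 0);
  const char *p = *cursor;

  while (*p == ' ' || *p == '\t')       /* skip blanks */
    p++;

  if (*p == '\0') {                     /* end of input */
    *cursor = p;
    return TOY_EOF;
  }

  if (isdigit((unsigned char) *p)) {    /* named token kind */
    int n = 0;
    while (isdigit((unsigned char) *p))
      n = n * 10 + (*p++ - '0');
    *value = n;
    *cursor = p;
    return TOY_NUM;
  }

  *cursor = p + 1;                      /* character literal kind */
  return (unsigned char) *p;
}
```

Calling toy_lex repeatedly on "12 + 3" yields a number token, then the code of '+', then another number token, then the end-of-input code.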
@@ -2230,14 +2230,13 @@ the same as the declarations for the infix notation calculator.
 @end example

 @noindent
-Note there are no declarations specific to locations. Defining a data
-type for storing locations is not needed: we will use the type provided
-by default (@pxref{Location Type}), which is a
-four member structure with the following integer fields:
-@code{first_line}, @code{first_column}, @code{last_line} and
-@code{last_column}. By conventions, and in accordance with the GNU
-Coding Standards and common practice, the line and column count both
-start at 1.
+Note there are no declarations specific to locations. Defining a data type
+for storing locations is not needed: we will use the type provided by
+default (@pxref{Location Type}), which is a four member structure with the
+following integer fields: @code{first_line}, @code{first_column},
+@code{last_line} and @code{last_column}. By conventions, and in accordance
+with the GNU Coding Standards and common practice, the line and column count
+both start at 1.

 @node Ltcalc Rules
 @subsection Grammar Rules for @code{ltcalc}
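The four-member location structure and the "counts start at 1" convention mentioned in this hunk can be illustrated with a small stand-alone sketch. The type toy_location and the helpers toy_location_init and toy_location_advance are hypothetical names introduced here; only the four field names come from the text. Treating last_line/last_column as the position just past the token is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Same four integer fields as the default location type in the text. */
struct toy_location {
  int first_line;
  int first_column;
  int last_line;
  int last_column;
};

/* Lines and columns both start at 1. */
void toy_location_init(struct toy_location *loc)
{
  assert(loc != NULL);
  loc->first_line = loc->last_line = 1;
  loc->first_column = loc->last_column = 1;
}

/* Move the location over the text of one token: the new start is the
   previous end, and the end advances one column per character, with a
   newline moving to column 1 of the next line. */
void toy_location_advance(struct toy_location *loc, const char *text)
{
  assert(loc != NULL && text != NULL);
  loc->first_line = loc->last_line;
  loc->first_column = loc->last_column;
  for (const char *p = text; *p != '\0'; p++) {
    if (*p == '\n') {
      loc->last_line++;
      loc->last_column = 1;
    } else {
      loc->last_column++;
    }
  }
}
```

A scanner would typically call the advance helper once per matched token before returning the token kind.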
@@ -2646,7 +2645,7 @@ By simply editing the initialization list and adding the necessary include
 files, you can add additional functions to the calculator.

 Two important functions allow look-up and installation of symbols in the
-symbol table. The function @code{putsym} is passed a name and the type
+symbol table. The function @code{putsym} is passed a name and the kind
 (@code{VAR} or @code{FUN}) of the object to be installed. The object is
 linked to the front of the list, and a pointer to the object is returned.
 The function @code{getsym} is passed the name of the symbol to look up. If
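The behaviour described for putsym and getsym (install at the front of a linked list, look up by name) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the manual's own listing: the record layout, the enum sym_kind with SYM_VAR and SYM_FUN, and the global sym_table are hypothetical, and returning NULL for a missing name is assumed because the hunk is cut off at that point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol kinds standing in for VAR and FUN. */
enum sym_kind { SYM_VAR, SYM_FUN };

/* One record of the symbol table, chained into a singly linked list. */
struct symrec {
  char *name;
  enum sym_kind kind;
  double value;
  struct symrec *next;
};

static struct symrec *sym_table = NULL;

/* Install a new symbol at the front of the list and return it,
   or NULL if memory allocation fails. */
struct symrec *putsym(const char *name, enum sym_kind kind)
{
  assert(name != NULL);
  struct symrec *rec = malloc(sizeof *rec);
  if (rec == NULL)
    return NULL;
  rec->name = malloc(strlen(name) + 1);
  if (rec->name == NULL) {
    free(rec);
    return NULL;
  }
  strcpy(rec->name, name);
  rec->kind = kind;
  rec->value = 0.0;
  rec->next = sym_table;
  sym_table = rec;
  return rec;
}

/* Return the most recently installed symbol with this name,
   or NULL when there is none (assumed behaviour). */
struct symrec *getsym(const char *name)
{
  assert(name != NULL);
  for (struct symrec *p = sym_table; p != NULL; p = p->next)
    if (strcmp(p->name, name) == 0)
      return p;
  return NULL;
}
```

Because new records go to the front, a later installation of the same name shadows an earlier one during look-up.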
@@ -3698,10 +3697,9 @@ In a simple program it may be sufficient to use the same data type for
 the semantic values of all language constructs. This was true in the
 RPN and infix calculator examples (@pxref{RPN Calc}).

-Bison normally uses the type @code{int} for semantic values if your
-program uses the same data type for all language constructs. To
-specify some other type, define the @code{%define} variable
-@code{api.value.type} like this:
+Bison normally uses the type @code{int} for semantic values if your program
+uses the same data type for all language constructs. To specify some other
+type, define the @code{%define} variable @code{api.value.type} like this:

 @example
 %define api.value.type @{double@}
@@ -4492,10 +4490,9 @@ Defining a data type for locations is much simpler than for semantic values,
 since all tokens and groupings always use the same type.

 You can specify the type of locations by defining a macro called
-@code{YYLTYPE}, just as you can specify the semantic value type by
-defining a @code{YYSTYPE} macro (@pxref{Value Type}).
-When @code{YYLTYPE} is not defined, Bison uses a default structure type with
-four members:
+@code{YYLTYPE}, just as you can specify the semantic value type by defining
+a @code{YYSTYPE} macro (@pxref{Value Type}). When @code{YYLTYPE} is not
+defined, Bison uses a default structure type with four members:

 @example
 typedef struct YYLTYPE
@@ -7161,7 +7158,7 @@ yylex (void)
 return c; /* Assume token kind for '+' is '+'. */
 @dots{}
 else
-return INT; /* Return the type of the token. */
+return INT; /* Return the kind of the token. */
 @dots{}
 @}
 @end example
@@ -7211,7 +7208,7 @@ the type is @code{int} (the default), you might write this in @code{yylex}:
 @group
 @dots{}
 yylval = value; /* Put value onto Bison stack. */
-return INT; /* Return the type of the token. */
+return INT; /* Return the kind of the token. */
 @dots{}
 @end group
 @end example
@@ -7238,7 +7235,7 @@ then the code in @code{yylex} might look like this:
 @group
 @dots{}
 yylval.intval = value; /* Put value onto Bison stack. */
-return INT; /* Return the type of the token. */
+return INT; /* Return the kind of the token. */
 @dots{}
 @end group
 @end example
@@ -7279,7 +7276,7 @@ yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
 @{
 @dots{}
 *lvalp = value; /* Put value onto Bison stack. */
-return INT; /* Return the type of the token. */
+return INT; /* Return the kind of the token. */
 @dots{}
 @}
 @end example
@@ -8383,15 +8380,14 @@ represent the entire sequence of terminal and nonterminal symbols at or
 near the top of the stack. The current state collects all the information
 about previous input which is relevant to deciding what to do next.

-Each time a lookahead token is read, the current parser state together
-with the type of lookahead token are looked up in a table. This table
-entry can say, ``Shift the lookahead token.'' In this case, it also
-specifies the new parser state, which is pushed onto the top of the
-parser stack. Or it can say, ``Reduce using rule number @var{n}.''
-This means that a certain number of tokens or groupings are taken off
-the top of the stack, and replaced by one grouping. In other words,
-that number of states are popped from the stack, and one new state is
-pushed.
+Each time a lookahead token is read, the current parser state together with
+the kind of lookahead token are looked up in a table. This table entry can
+say, ``Shift the lookahead token.'' In this case, it also specifies the new
+parser state, which is pushed onto the top of the parser stack. Or it can
+say, ``Reduce using rule number @var{n}.'' This means that a certain number
+of tokens or groupings are taken off the top of the stack, and replaced by
+one grouping. In other words, that number of states are popped from the
+stack, and one new state is pushed.

 There is one other alternative: the table can say that the lookahead token
 is erroneous in the current state. This causes error processing to begin
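The table lookup described in this hunk (state plus lookahead kind gives shift, reduce, or error) can be made concrete with a toy parser. The sketch below is a hand-built table for the two-rule grammar "E: E '+' NUM | NUM", written only for illustration; it is not what Bison generates, and every name in it (toy_tok, toy_action, toy_parse, the table layout) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Token kinds of the toy grammar; T_EOF ends every input. */
enum toy_tok { T_EOF = 0, T_NUM, T_PLUS, N_TOKS };

enum toy_act { ACT_ERROR = 0, ACT_SHIFT, ACT_REDUCE, ACT_ACCEPT };
struct toy_action { enum toy_act kind; int arg; };

/* Rule 1: E -> E '+' NUM (3 symbols).  Rule 2: E -> NUM (1 symbol). */
static const int rule_len[] = { 0, 3, 1 };

/* State to push after reducing to E, indexed by the uncovered state. */
static const int goto_E[] = { 2, -1, -1, -1, -1 };

/* action[state][lookahead]; unspecified entries are ACT_ERROR. */
static const struct toy_action action[5][N_TOKS] = {
  /* 0: start     */ { [T_NUM]  = { ACT_SHIFT, 1 } },
  /* 1: NUM .     */ { [T_EOF]  = { ACT_REDUCE, 2 }, [T_PLUS] = { ACT_REDUCE, 2 } },
  /* 2: E .       */ { [T_EOF]  = { ACT_ACCEPT, 0 }, [T_PLUS] = { ACT_SHIFT, 3 } },
  /* 3: E + .     */ { [T_NUM]  = { ACT_SHIFT, 4 } },
  /* 4: E + NUM . */ { [T_EOF]  = { ACT_REDUCE, 1 }, [T_PLUS] = { ACT_REDUCE, 1 } },
};

/* Return 1 if the T_EOF-terminated input is accepted, 0 on syntax error. */
int toy_parse(const enum toy_tok *input)
{
  int stack[64];
  int top = 0;
  size_t pos = 0;
  assert(input != NULL);
  stack[0] = 0;
  for (;;) {
    enum toy_tok la = input[pos];
    assert((int) la >= 0 && la < N_TOKS);
    struct toy_action a = action[stack[top]][la];
    switch (a.kind) {
    case ACT_SHIFT:               /* push the new state, consume the token */
      if (top + 1 >= 64)
        return 0;
      stack[++top] = a.arg;
      pos++;
      break;
    case ACT_REDUCE:              /* pop one state per symbol, push one */
      top -= rule_len[a.arg];
      stack[top + 1] = goto_E[stack[top]];
      top++;
      break;
    case ACT_ACCEPT:
      return 1;
    default:                      /* lookahead is erroneous in this state */
      return 0;
    }
  }
}
```

A reduction pops as many states as the rule has symbols and pushes a single new state, which matches the "that number of states are popped from the stack, and one new state is pushed" wording above.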
@@ -11624,8 +11620,8 @@ particular it produces a genuine @code{union}, which have a few specific
 features in C++.
 @itemize @minus
 @item
-The type @code{YYSTYPE} is defined but its use is discouraged: rather
-you should refer to the parser's encapsulated type
+The type @code{YYSTYPE} is defined but its use is discouraged: rather you
+should refer to the parser's encapsulated type
 @code{yy::parser::semantic_type}.
 @item
 Non POD (Plain Old Data) types cannot be used. C++98 forbids any instance