doc: token.prefix

* doc/bison.texinfo (Decl Summary): Document token.prefix.
	(Calc++ Parser): Various fixes.
	Formatting changes.
	Use token.prefix.
	Introduce a macro TOKEN to shorten the code and make it more
	readable.
	(Calc++ Scanner): Adjust.
	* NEWS (Variable token.prefix): New.
Akim Demaille
2009-05-07 09:13:08 +02:00
parent ab2a9f5793
commit 99c08fb662
3 changed files with 99 additions and 27 deletions


@@ -1,3 +1,15 @@
2009-05-11 Akim Demaille <demaille@gostai.com>
doc: token.prefix
* doc/bison.texinfo (Decl Summary): Document token.prefix.
(Calc++ Parser): Various fixes.
Formatting changes.
Use token.prefix.
Introduce a macro TOKEN to shorten the code and make it more
readable.
(Calc++ Scanner): Adjust.
* NEWS (Variable token.prefix): New.
2009-05-04 Akim Demaille <demaille@gostai.com>
bison: catch bad symbol names.

NEWS

@@ -9,6 +9,22 @@ Bison News
Also, it is possible to add code to the parser's constructors using
"%code init" and "%define init_throws".
** Variable token.prefix
The variable token.prefix changes the way tokens are identified in
the generated files. This is especially useful to avoid collisions
with identifiers in the target language. For instance
%token FILE for ERROR
%define token.prefix "TOK_"
%%
start: FILE for ERROR;
will generate the definition of the symbols TOK_FILE, TOK_for, and
TOK_ERROR in the generated sources. In particular, the scanner must
use these prefixed token names, although the grammar itself still
uses the short names (as in the sample rule given above).
* Changes in version 2.5 (????-??-??):
** IELR(1) and Canonical LR(1) Support


@@ -5190,10 +5190,47 @@ is not already defined, so that the debugging facilities are compiled.
@item Default Value: @code{false}
@end itemize
@end table
@c parse.trace
@item token.prefix
@findex %define token.prefix
@itemize
@item Language(s): all
@item Purpose:
Add a prefix to the token names when generating their definition in the
target language. For instance
@example
%token FILE for ERROR
%define token.prefix "TOK_"
%%
start: FILE for ERROR;
@end example
@noindent
generates the definition of the symbols @code{TOK_FILE}, @code{TOK_for},
and @code{TOK_ERROR} in the generated source files. In particular, the
scanner must use these prefixed token names, while the grammar itself
may still use the short names (as in the sample rule given above). The
generated informational files (@file{*.output}, @file{*.xml},
@file{*.dot}) are not modified by this prefix. See @ref{Calc++ Parser}
and @ref{Calc++ Scanner}, for a complete example.
@item Accepted Values:
Any string. Should be a valid identifier prefix in the target language,
in other words, it should typically be an identifier itself (sequence of
letters, underscores, and ---not at the beginning--- digits).
@item Default Value:
empty
@end itemize
@c token.prefix
@end table
@end deffn
@c %define
@c ---------------------------------------------------------- %define
@deffn {Directive} %defines
Write a header file containing macro definitions for the token type
@@ -8777,13 +8814,14 @@ The code between @samp{%code @{} and @samp{@}} is output in the
@noindent
The token numbered as 0 corresponds to end of file; the following line
allows for nicer error messages referring to ``end of file'' instead
of ``$end''. Similarly user friendly named are provided for each
symbol. Note that the tokens names are prefixed by @code{TOKEN_} to
avoid name clashes.
allows for nicer error messages referring to ``end of file'' instead of
``$end''. Similarly user friendly names are provided for each symbol.
To avoid name clashes in the generated files (@pxref{Calc++ Scanner}),
prefix tokens with @code{TOK_} (@pxref{Decl Summary,, token.prefix}).
@comment file: calc++-parser.yy
@example
%define token.prefix "TOK_"
%token END 0 "end of file"
%token ASSIGN ":="
%token <sval> IDENTIFIER "identifier"
@@ -8813,22 +8851,24 @@ The grammar itself is straightforward.
%start unit;
unit: assignments exp @{ driver.result = $2; @};
assignments: assignments assignment @{@}
| /* Nothing. */ @{@};
assignments:
assignments assignment @{@}
| /* Nothing. */ @{@};
assignment:
"identifier" ":=" exp
"identifier" ":=" exp
@{ driver.variables[*$1] = $3; delete $1; @};
%left '+' '-';
%left '*' '/';
exp: exp '+' exp @{ $$ = $1 + $3; @}
| exp '-' exp @{ $$ = $1 - $3; @}
| exp '*' exp @{ $$ = $1 * $3; @}
| exp '/' exp @{ $$ = $1 / $3; @}
| '(' exp ')' @{ $$ = $2; @}
| "identifier" @{ $$ = driver.variables[*$1]; delete $1; @}
| "number" @{ $$ = $1; @};
exp:
exp '+' exp @{ $$ = $1 + $3; @}
| exp '-' exp @{ $$ = $1 - $3; @}
| exp '*' exp @{ $$ = $1 * $3; @}
| exp '/' exp @{ $$ = $1 / $3; @}
| '(' exp ')' @{ $$ = $2; @}
| "identifier" @{ $$ = driver.variables[*$1]; delete $1; @}
| "number" @{ $$ = $1; @};
%%
@end example
@@ -8869,10 +8909,10 @@ parser's to get the set of defined tokens.
# undef yywrap
# define yywrap() 1
/* By default yylex returns int, we use token_type.
Unfortunately yyterminate by default returns 0, which is
/* By default yylex returns an int; we use token_type.
The default yyterminate implementation returns 0, which is
not of token_type. */
#define yyterminate() return token::END
#define yyterminate() return TOKEN(END)
%@}
@end example
@@ -8920,28 +8960,32 @@ preceding tokens. Comments would be treated equally.
@end example
@noindent
The rules are simple, just note the use of the driver to report errors.
It is convenient to use a typedef to shorten
@code{yy::calcxx_parser::token::identifier} into
@code{token::identifier} for instance.
The rules are simple. The driver is used to report errors. It is
convenient to use a macro to shorten
@code{yy::calcxx_parser::token::TOK_@var{Name}} into
@code{TOKEN(@var{Name})}; note the token prefix, @code{TOK_}.
@comment file: calc++-scanner.ll
@example
%@{
typedef yy::calcxx_parser::token token;
# define TOKEN(Name) \
yy::calcxx_parser::token::TOK_ ## Name
%@}
/* Convert ints to the actual type of tokens. */
[-+*/()] return yy::calcxx_parser::token_type (yytext[0]);
":=" return token::ASSIGN;
":=" return TOKEN(ASSIGN);
@{int@} @{
errno = 0;
long n = strtol (yytext, NULL, 10);
if (! (INT_MIN <= n && n <= INT_MAX && errno != ERANGE))
driver.error (*yylloc, "integer is out of range");
yylval->ival = n;
return token::NUMBER;
return TOKEN(NUMBER);
@}
@{id@} @{
yylval->sval = new std::string (yytext);
return TOKEN(IDENTIFIER);
@}
@{id@} yylval->sval = new std::string (yytext); return token::IDENTIFIER;
. driver.error (*yylloc, "invalid character");
%%
@end example
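
The TOKEN macro introduced in the scanner can be tried in isolation. The following is a minimal sketch, not part of the commit: the yy::calcxx_parser class below is a hand-written stand-in for what Bison would generate with %define token.prefix "TOK_", the numeric token values are illustrative assumptions, and classify is a hypothetical helper that only mimics the scanner rules.

```cpp
#include <cassert>

// Hypothetical stand-in for the class Bison would generate from
// calc++-parser.yy with %define token.prefix "TOK_".  The numeric
// values are illustrative assumptions, not taken from the commit.
namespace yy
{
  class calcxx_parser
  {
  public:
    struct token
    {
      enum yytokentype
      {
        TOK_END = 0,
        TOK_ASSIGN = 258,
        TOK_IDENTIFIER = 259,
        TOK_NUMBER = 260
      };
    };
    typedef token::yytokentype token_type;
  };
}

// Same macro as in the scanner: ## pastes TOK_ onto the short name,
// so TOKEN(NUMBER) names yy::calcxx_parser::token::TOK_NUMBER.
#define TOKEN(Name) \
  yy::calcxx_parser::token::TOK_ ## Name

// Hypothetical helper: pick a token kind for a lexeme, roughly the
// way the scanner rules do.
inline yy::calcxx_parser::token_type
classify (const char *text)
{
  if (text[0] == '\0')
    return TOKEN(END);
  if (text[0] == ':' && text[1] == '=' && text[2] == '\0')
    return TOKEN(ASSIGN);
  if ('0' <= text[0] && text[0] <= '9')
    return TOKEN(NUMBER);
  return TOKEN(IDENTIFIER);
}
```

Because the prefix is written only inside the macro, a later change of token.prefix would need a single edit in the scanner rather than one per return statement.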