api.token.raw: document it

* doc/bison.texi: here.
Akim Demaille
2019-08-31 19:22:32 -05:00
parent 8c18e3f18c
commit 3ca713abd0
3 changed files with 55 additions and 1 deletions

14
NEWS

@@ -21,6 +21,20 @@ GNU Bison NEWS
The C++ deterministic skeleton (lalr1.cc) now supports LAC, via the
%define variable parse.lac.
*** Variable api.token.raw: Optimized token numbers (all skeletons)
In the generated parsers, tokens have two numbers: the "external" token
number as returned by yylex (which starts at 257), and the "internal"
symbol number (which starts at 3). Each time yylex is called, a table
lookup maps the external token number to the internal symbol number.
When the %define variable api.token.raw is set, tokens are assigned their
internal number, which saves one table lookup per token, and also saves
the generation of the mapping table.
The gain is typically moderate, but in extreme cases (very simple user
actions), a 10% improvement can be observed.
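The difference between the two numbering schemes can be sketched in C as follows. This is only an illustration under stated assumptions: the names (translate_table, translate, translate_raw, EXT_*, SYM_*) and the two-token grammar are hypothetical, and the tables Bison actually generates differ.

```c
#include <assert.h>

/* Hypothetical grammar with two named tokens, NUM and ID.
   All numbers are illustrative, following the description above:
   external numbers start at 257, internal symbol numbers at 3.  */
enum { EXT_EOF = 0, EXT_NUM = 257, EXT_ID = 258, EXT_MAX = 258 };
enum { SYM_EOF = 0, SYM_UNDEF = 2, SYM_NUM = 3, SYM_ID = 4 };

/* Mapping table from external token numbers to internal symbol
   numbers.  This is the table that need not be generated when
   api.token.raw is set.  */
static unsigned char translate_table[EXT_MAX + 1];

/* Fill the table: every unknown external number maps to the
   "undefined" symbol, the known ones to their internal number.  */
static void init_translate_table(void)
{
  for (int i = 0; i <= EXT_MAX; ++i)
    translate_table[i] = SYM_UNDEF;
  translate_table[EXT_EOF] = SYM_EOF;
  translate_table[EXT_NUM] = SYM_NUM;
  translate_table[EXT_ID] = SYM_ID;
}

/* Default scheme: one bounds check and one table lookup per token.  */
static int translate(int external)
{
  if (external < 0 || EXT_MAX < external)
    return SYM_UNDEF;
  return translate_table[external];
}

/* Raw scheme: the scanner already returns the internal symbol
   number, so there is nothing to look up.  */
static int translate_raw(int symbol)
{
  assert(0 <= symbol);
  return symbol;
}
```

With the default scheme, a scanner returning EXT_NUM goes through translate to obtain SYM_NUM; with the raw scheme the scanner returns SYM_NUM directly.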
*** Debug traces in Java
The Java backend no longer emits code and data for parser tracing if the

6
TODO

@@ -73,7 +73,11 @@ syntax error, unexpected $end, expecting ↦ or 🎅🐃 or '\n'
While at it, we should stop using "$end" by default, in favor of "end of
file", or "end of input", whatever.
file", or "end of input", whatever. See how lalr1.java does that.
** api.token.raw
Maybe we should exhibit the YYUNDEFTOK token. It could also be assigned a
semantic value so that yyerror could be used to report invalid lexemes.
* Bison 3.6
** Unit rules

36
doc/bison.texi

@@ -6212,6 +6212,42 @@ introduced in Bison 3.0
@c api.token.prefix
@c ================================================== api.token.raw
@deffn Directive {%define api.token.raw}
@itemize @bullet
@item Language(s):
all
@item Purpose:
The output files normally define the tokens with Yacc-compatible token
numbers: sequential numbers starting at 257 except for single character
tokens which stand for themselves (e.g., in ASCII, @samp{'a'} is numbered
97). The parser however uses symbol numbers assigned sequentially starting
at 3. Therefore each time the scanner returns an (external) token number,
it must be mapped to the (internal) symbol number.
When @code{api.token.raw} is set, tokens are assigned their internal number,
which saves one table lookup per token to map them from the external to the
internal number, and also saves the generation of the mapping table. The
gain is typically moderate, but in extreme cases (very simple user actions),
a 10% improvement can be observed.
When @code{api.token.raw} is set, the grammar cannot use character literals
(such as @samp{'a'}).
@item Accepted Values: Boolean.
@item Default Value:
@code{false}
@item History:
introduced in Bison 3.5. Was initially introduced in Bison 1.25 as
@samp{%raw}, but never worked and was removed in Bison 1.29.
@end itemize
@end deffn
@c api.token.raw
@c ================================================== api.value.automove
@deffn Directive {%define api.value.automove}