c++: exhibit a safe symbol_type

Instead of introducing make_symbol (whose name, btw, intrudes on the
user's "name space": it would clash with the named constructor of a
token called "symbol"), let's make the construction of symbol_type
safer, using assertions.

For instance with:

    %token ':' <std::string> ID <int> INT;

generate:

    symbol_type (int token, const std::string&);
    symbol_type (int token, const int&);
    symbol_type (int token);

It does mean that now named token constructors (make_ID, make_INT,
etc.) go through a useless assert, but I think we can ignore this: I
assume any decent compiler will inline the symbol_type ctor inside the
make_TOKEN functions, which will show that the assert is trivially
verified, hence I expect no code will be emitted for it.  And anyway,
that's an assert, NDEBUG controls it.
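
To make the idea concrete, here is a minimal sketch of such asserting constructors. It is not the code Bison emits: the token numbers, the plain data members (instead of a variant) and the absence of locations are all simplifying assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical token numbers, standing in for the ones Bison would generate.
enum token { TOK_EOF = 0, TOK_COLON = ':', TOK_ID = 258, TOK_INT = 259 };

// Simplified stand-in for the generated symbol_type: no variant, no location.
struct symbol_type
{
  int type;
  std::string text;  // meaningful for TOK_ID only
  int number = 0;    // meaningful for TOK_INT only

  // Tokens without a semantic value.
  explicit symbol_type (int tok)
    : type (tok)
  {
    assert (tok == TOK_EOF || tok == TOK_COLON);
  }

  // Tokens whose semantic value is a std::string.
  symbol_type (int tok, const std::string& v)
    : type (tok), text (v)
  {
    assert (tok == TOK_ID);
  }

  // Tokens whose semantic value is an int.
  symbol_type (int tok, const int& v)
    : type (tok), number (v)
  {
    assert (tok == TOK_INT);
  }
};

// Named constructors: the token is a compile-time constant here, so once
// the constructor is inlined the assertion is trivially true.
inline symbol_type make_ID (const std::string& v)
{
  return symbol_type (TOK_ID, v);
}

inline symbol_type make_INT (const int& v)
{
  return symbol_type (TOK_INT, v);
}
```

With these definitions, symbol_type (':', "foo") still compiles, since overload resolution only sees an int and a string, but it aborts at run time unless NDEBUG is defined. make_ID and make_INT cannot be misused that way.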

* data/c++.m4 (symbol_type): Turn into a subclass of
basic_symbol<by_type>.
Declare symbol constructors when variants are enabled.
* data/variant.hh (_b4_type_constructor_declare)
(_b4_type_constructor_define): Replace with...
(_b4_symbol_constructor_declare, _b4_symbol_constructor_def): these.
Generate symbol_type constructors.
* doc/bison.texi (Complete Symbols): Document.
* tests/types.at: Check.
Akim Demaille
2018-12-19 17:51:10 +01:00
parent 1f4dd2671a
commit e5780041b9
5 changed files with 142 additions and 43 deletions


@@ -11500,6 +11500,57 @@ additional arguments.
For each token type, Bison generates named constructors as follows.
@deftypeop {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const @var{value_type}& @var{value}, const location_type& @var{location})
@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const location_type& @var{location})
@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token}, const @var{value_type}& @var{value})
@deftypeopx {Constructor} {parser::symbol_type} {} {symbol_type} (int @var{token})
Build a complete terminal symbol for the token type @var{token} (including
the @code{api.token.prefix}), whose semantic value, if it has one, is
@var{value} of adequate @var{value_type}. Pass the @var{location} iff
location tracking is enabled.
Consistency between @var{token} and @var{value_type} is checked via an
@code{assert}.
@end deftypeop
For instance, given the following declarations:
@example
%define api.token.prefix @{TOK_@}
%token <std::string> IDENTIFIER;
%token <int> INTEGER;
%token ':';
@end example
@noindent
you may use these constructors:
@example
symbol_type (int token, const std::string&, const location_type&);
symbol_type (int token, const int&, const location_type&);
symbol_type (int token, const location_type&);
@end example
@noindent
which should be used in a Flex-scanner as follows.
@example
%%
[a-z]+ return yy::parser::symbol_type (TOK_IDENTIFIER, yytext, loc);
[0-9]+ return yy::parser::symbol_type (TOK_INTEGER, text_to_int (yytext), loc);
":" return yy::parser::symbol_type (':', loc);
<<EOF>> return yy::parser::symbol_type (0, loc);
@end example
@sp 1
Note that it is possible to generate and compile type incorrect code
(e.g. @samp{symbol_type (':', yytext, loc)}). It will fail at run time,
provided the assertions are enabled (i.e., @option{-DNDEBUG} was not passed
to the compiler). Bison supports an alternative that guarantees that type
incorrect code will not even compile. Indeed, it generates @emph{named
constructors} as follows.
@deftypemethod {parser} {symbol_type} {make_@var{token}} (const @var{value_type}& @var{value}, const location_type& @var{location})
@deftypemethodx {parser} {symbol_type} {make_@var{token}} (const location_type& @var{location})
@deftypemethodx {parser} {symbol_type} {make_@var{token}} (const @var{value_type}& @var{value})
@@ -11531,7 +11582,7 @@ symbol_type make_EOF (const location_type&);
@end example
@noindent
which should be used in a Flex-scanner as follows.
which should be used in a scanner as follows.
@example
[a-z]+ return yy::parser::make_IDENTIFIER (yytext, loc);
@@ -11544,6 +11595,7 @@ Tokens that do not have an identifier are not accessible: you cannot simply
use characters such as @code{':'}, they must be declared with @code{%token},
including the end-of-file token.
@node A Complete C++ Example
@subsection A Complete C++ Example