doc: typed mid-rule actions

* doc/bison.texi (Mid-Rule Actions): Restructure to insert...
(Typed Mid-Rule Actions): this new section.
Move the manual translation of mid-rule actions into regular actions
to...
(Mid-Rule Action Translation): here.
Akim Demaille
2018-08-12 10:49:29 +02:00
parent adf0425d11
commit 005ea24cbb


@@ -225,6 +225,7 @@ Defining Language Semantics
Actions in Mid-Rule Actions in Mid-Rule
* Using Mid-Rule Actions:: Putting an action in the middle of a rule. * Using Mid-Rule Actions:: Putting an action in the middle of a rule.
* Typed Mid-Rule Actions:: Specifying the semantic type of their values.
* Mid-Rule Action Translation:: How mid-rule actions are actually processed. * Mid-Rule Action Translation:: How mid-rule actions are actually processed.
* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts. * Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
@@ -4071,6 +4072,7 @@ are executed before the parser even recognizes the following components.
@menu @menu
* Using Mid-Rule Actions:: Putting an action in the middle of a rule. * Using Mid-Rule Actions:: Putting an action in the middle of a rule.
* Typed Mid-Rule Actions:: Specifying the semantic type of their values.
* Mid-Rule Action Translation:: How mid-rule actions are actually processed. * Mid-Rule Action Translation:: How mid-rule actions are actually processed.
* Mid-Rule Conflicts:: Mid-rule actions can cause conflicts. * Mid-Rule Conflicts:: Mid-rule actions can cause conflicts.
@end menu @end menu
@@ -4158,64 +4160,86 @@ earlier action is used to restore the prior list of variables. This
removes the temporary @code{let}-variable from the list so that it won't removes the temporary @code{let}-variable from the list so that it won't
appear to exist while the rest of the program is parsed. appear to exist while the rest of the program is parsed.
Because the types of the semantic values of mid-rule actions are unknown to
Bison, type-based features (e.g., @samp{%printer}, @samp{%destructor}) do
not work, which could result in memory leaks. They also forbid the use of
the @code{variant} implementation of the @code{api.value.type} in C++
(@pxref{C++ Variants}).
@xref{Typed Mid-Rule Actions}, for one way to address this issue, and
@ref{Mid-Rule Action Translation}, for another: turning mid-rule actions
into regular actions.
@node Typed Mid-Rule Actions
@subsubsection Typed Mid-Rule Actions
@findex %destructor @findex %destructor
@cindex discarded symbols, mid-rule actions @cindex discarded symbols, mid-rule actions
@cindex error recovery, mid-rule actions @cindex error recovery, mid-rule actions
In the above example, if the parser initiates error recovery (@pxref{Error In the above example, if the parser initiates error recovery (@pxref{Error
Recovery}) while parsing the tokens in the embedded statement @code{stmt}, Recovery}) while parsing the tokens in the embedded statement @code{stmt},
it might discard the previous semantic context @code{$<context>5} without it might discard the previous semantic context @code{$<context>5} without
restoring it. restoring it. Thus, @code{$<context>5} needs a destructor
Thus, @code{$<context>5} needs a destructor (@pxref{Destructor Decl, , Freeing (@pxref{Destructor Decl, , Freeing Discarded Symbols}), and Bison needs the
Discarded Symbols}). type of the semantic value (@code{context}) to select the right destructor.
However, Bison currently provides no means to declare a destructor specific to
a particular mid-rule action's semantic value.
One solution is to bury the mid-rule action inside a nonterminal symbol and to As an extension to Yacc's mid-rule actions, Bison offers a means to type
declare a destructor for that symbol: their semantic value: specify its type tag (@samp{<...>}) before the mid-rule
action.
Consider the previous example, with an untyped mid-rule action:
@example @example
@group
%type <context> let
%destructor @{ pop_context ($$); @} let
@end group
%%
@group @group
stmt: stmt:
let stmt
@{
$$ = $2;
pop_context ($let);
@};
@end group
@group
let:
"let" '(' var ')' "let" '(' var ')'
@{ @{
$let = push_context (); $<context>$ = push_context (); // ***
declare_variable ($3); declare_variable ($3);
@}; @}
stmt
@{
$$ = $6;
pop_context ($<context>5); // ***
@}
@end group @end group
@end example @end example
@noindent @noindent
Note that the action is now at the end of its rule. If instead you write:
Any mid-rule action can be converted to an end-of-rule action in this way, and
this is what Bison actually does to implement mid-rule actions. @example
@group
stmt:
"let" '(' var ')'
<context>@{ // ***
$$ = push_context (); // ***
declare_variable ($3);
@}
stmt
@{
$$ = $6;
pop_context ($5); // ***
@}
@end group
@end example
@noindent
then @code{%printer} and @code{%destructor} work properly (no more leaks!),
C++ @code{variant}s can be used, and redundancy is reduced (@code{<context>}
is specified once).
@node Mid-Rule Action Translation @node Mid-Rule Action Translation
@subsubsection Mid-Rule Action Translation @subsubsection Mid-Rule Action Translation
@vindex $@@@var{n} @vindex $@@@var{n}
@vindex @@@var{n} @vindex @@@var{n}
As hinted earlier, mid-rule actions are actually transformed into regular Mid-rule actions are actually transformed into regular rules and actions.
rules and actions. The various reports generated by Bison (textual, The various reports generated by Bison (textual, graphical, etc., see
graphical, etc., see @ref{Understanding, , Understanding Your Parser}) @ref{Understanding, , Understanding Your Parser}) reveal this translation,
reveal this translation, best explained by means of an example. The best explained by means of an example. The following rule:
following rule:
@example @example
exp: @{ a(); @} "b" @{ c(); @} @{ d(); @} "e" @{ f(); @}; exp: @{ a(); @} "b" @{ c(); @} @{ d(); @} "e" @{ f(); @};
@@ -4273,6 +4297,45 @@ mid.y:2.19-31: warning: unused value: $3
@end group @end group
@end example @end example
@sp 1
It is sometimes useful to turn mid-rule actions into regular actions, e.g.,
to factor them, or to escape from their limitations. For instance, as an
alternative to @emph{typed} mid-rule actions, you may bury the mid-rule
action inside a nonterminal symbol and declare a printer and a destructor
for that symbol:
@example
@group
%type <context> let
%destructor @{ pop_context ($$); @} let
%printer @{ print_context (yyo, $$); @} let
@end group
%%
@group
stmt:
let stmt
@{
$$ = $2;
pop_context ($let);
@};
@end group
@group
let:
"let" '(' var ')'
@{
$let = push_context ();
declare_variable ($var);
@};
@end group
@end example
@node Mid-Rule Conflicts @node Mid-Rule Conflicts
@subsubsection Conflicts due to Mid-Rule Actions @subsubsection Conflicts due to Mid-Rule Actions
@@ -10523,7 +10586,7 @@ To enable variant-based semantic values, set @code{%define} variable
@code{%union} is ignored, and instead of using the name of the fields of the @code{%union} is ignored, and instead of using the name of the fields of the
@code{%union} to ``type'' the symbols, use genuine types. @code{%union} to ``type'' the symbols, use genuine types.
For instance, instead of For instance, instead of:
@example @example
%union %union
@@ -10536,7 +10599,7 @@ For instance, instead of
@end example @end example
@noindent @noindent
write write:
@example @example
%token <int> NUMBER; %token <int> NUMBER;
@@ -10555,7 +10618,10 @@ Variants are stricter than unions. When based on unions, you may play any
dirty game with @code{yylval}, say storing an @code{int}, reading a dirty game with @code{yylval}, say storing an @code{int}, reading a
@code{char*}, and then storing a @code{double} in it. This is no longer @code{char*}, and then storing a @code{double} in it. This is no longer
possible with variants: they must be initialized, then assigned to, and possible with variants: they must be initialized, then assigned to, and
eventually, destroyed. eventually, destroyed. As a matter of fact, Bison variants forbid the use
of alternative types such as @samp{$<int>2} or @samp{$<std::string>$}, even
in mid-rule actions. It is mandatory to use typed mid-rule actions
(@pxref{Typed Mid-Rule Actions}).
@deftypemethod {semantic_type} {T&} build<T> () @deftypemethod {semantic_type} {T&} build<T> ()
Initialize, but leave empty. Returns the address where the actual value may Initialize, but leave empty. Returns the address where the actual value may
@@ -10575,10 +10641,13 @@ Boost.Variant not only stores the value, but also a tag specifying its
type. But the parser already ``knows'' the type of the semantic value, so type. But the parser already ``knows'' the type of the semantic value, so
that would be duplicating the information. that would be duplicating the information.
We do not use C++17's @code{std::variant} either: we want to support all the
C++ standards, and of course @code{std::variant} also stores a tag to record
the current type.
Therefore we developed light-weight variants whose type tag is external (so Therefore we developed light-weight variants whose type tag is external (so
they are really like @code{unions} for C++ actually). But our code is much they are really like @code{unions} for C++ actually). There is a number of
less mature that Boost.Variant. So there is a number of limitations in limitations in (the current implementation of) variants:
(the current implementation of) variants:
@itemize @itemize
@item @item
Alignment must be enforced: values should be aligned in memory according to Alignment must be enforced: values should be aligned in memory according to
@@ -10588,6 +10657,9 @@ therefore, since, as far as we know, @code{double} is the most demanding
type on all platforms, alignments are enforced for @code{double} whatever type on all platforms, alignments are enforced for @code{double} whatever
types are actually used. This may waste space in some cases. types are actually used. This may waste space in some cases.
@item
Move semantics is not yet supported, but will soon be added.
@item @item
There might be portability issues we are not aware of. There might be portability issues we are not aware of.
@end itemize @end itemize
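
The idea of a variant ``whose type tag is external'' can be illustrated with a small sketch. This is not Bison's actual @code{semantic_type}: the class name @code{untagged_storage} and its members @code{emplace}, @code{as} and @code{destroy} are hypothetical names chosen for this illustration. The storage records nothing about the type it holds, so the caller (in a parser, the symbol kind) must name the type at every step, and each value must be initialized, used, and eventually destroyed, in that order.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical illustration only; not Bison's real implementation.
// Raw, aligned storage that does NOT remember which type it holds.
template <std::size_t Size, std::size_t Align>
class untagged_storage {
public:
  // Initialize: construct a T in place. The caller must remember T.
  template <typename T, typename... Args>
  T& emplace(Args&&... args) {
    static_assert(sizeof(T) <= Size, "type too large for storage");
    static_assert(alignof(T) <= Align, "type over-aligned for storage");
    return *new (buffer_) T(std::forward<Args>(args)...);
  }

  // Access: the caller supplies the type, acting as the external tag.
  template <typename T>
  T& as() {
    return *static_cast<T*>(static_cast<void*>(buffer_));
  }

  // Destroy: again the caller names the type; nothing is checked.
  template <typename T>
  void destroy() {
    as<T>().~T();
  }

private:
  alignas(Align) unsigned char buffer_[Size];
};

// Store a string, destroy it, then reuse the same storage for an int.
std::string demo() {
  untagged_storage<sizeof(std::string), alignof(std::string)> s;

  s.emplace<std::string>("let");
  s.as<std::string>() += " x";
  assert(s.as<std::string>() == "let x");
  std::string copy = s.as<std::string>();
  s.destroy<std::string>();  // must happen before the storage is reused

  s.emplace<int>(42);
  int n = s.as<int>();
  s.destroy<int>();

  return copy + ":" + std::to_string(n);
}
```

Compared with @code{std::variant} or Boost.Variant, nothing here detects a mismatched type at run time, which is why the type must be known statically at each use.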