multistart: use b4_accept instead of action post-processing

For each start symbol, generate a parsing function with a richer
return value than the usual of yyparse.  Reserve a place for the
returned semantic value, in order to avoid having to pass a pointer as
argument to "return" that value.  This also makes the call to the
parsing function independent of whether a given start-symbol is typed.

For instance, if the grammar file contains:

    %type <int> expression
    %start input expression

(so "input" is valueless) we get

    typedef struct
    {
      int yystatus;
    } yyparse_input_t;

    yyparse_input_t yyparse_input (void);

    typedef struct
    {
      int yyvalue;
      int yystatus;
    } yyparse_expression_t;

    yyparse_expression_t yyparse_expression (void);

This commit also changes the implementation of the parser termination:
when there are multiple start symbols, it is the initial rules that
explicitly YYACCEPT.  They do that after having exported the
start-symbol's value (if it is typed):

  switch (yyn)
    {
  case 1: /* $accept: YY_EXPRESSION expression $end  */
  { ((*yyvalue).TOK_expression) = (yyvsp[-1].TOK_expression); YYACCEPT; }
    break;

  case 2: /* $accept: YY_INPUT input $end  */
  { YYACCEPT; }
    break;

I have tried several ways to deal with termination, and this is the
one that appears the best one to me.  It is also the most natural.

* src/scan-code.h, src/scan-code.l (obstack_for_actions): New.
* src/reader.c (grammar_rule_check_and_complete): Generate the actions
of the rules for each start symbol.

* data/skeletons/bison.m4 (b4_symbol_slot): New, with safer semantics
than type and type_tag.
* data/skeletons/yacc.c (b4_accept): New.
Generates the body of the action of the start rules.
(_b4_declare_sub_yyparse): For each start symbol define a dedicated
return type for its parsing function.
Adjust the declaration of its parsing function.
(_b4_define_sub_yyparse): Adjust the definition of the function.

* examples/c/lexcalc/parse.y: Check the case of valueless symbols.
* examples/c/lexcalc/lexcalc.test: Check start symbols.
This commit is contained in:
Akim Demaille
2020-07-05 08:00:20 +02:00
parent a6805bb8d9
commit d9cf99b6a5
10 changed files with 146 additions and 43 deletions

View File

@@ -142,11 +142,17 @@ The macro `b4_symbol(NUM, FIELD)` gives access to the following FIELDS:
When api.value.type=union, the generated name for the union member.
yytype_INT etc. for symbols that has_id, otherwise yytype_1 etc.
- `type`
- `type`: string
If it has a semantic value, its type tag, or, if variant are used,
its type.
In the case of api.value.type=union, type is the real type (e.g. int).
- `slot`: string
If it has a semantic value, the name of the union member (i.e., bounces to
either `type_tag` or `type`). It would be better to fix our mess and
always use `type` for the true type of the member, and `type_tag` for the
name of the union member.
- `has_printer`: 0, 1
- `printer`: string
- `printer_file`: string