Extract the parsing of user actions from the grammar scanner.

As a consequence, the relation between the grammar scanner and
parser is much simpler. We can also split "composite tokens" back
into simple tokens.

* src/gram.h (ITEM_NUMBER_MAX, RULE_NUMBER_MAX): New.
* src/scan-gram.l (add_column_width, adjust_location): Move to and
rename as...
* src/location.h, src/location.c (add_column_width)
(location_compute): these.
Fix the column count: the initial column is 0.
(location_print): Be robust to ending column being 0.
* src/location.h (boundary_set): New.
* src/main.c: Adjust to scanner_free being renamed as
gram_scanner_free.
* src/output.c: Include scan-code.h.
* src/parse-gram.y: Include scan-gram.h and scan-code.h.
Use boundary_set.
(PERCENT_DESTRUCTOR, PERCENT_PRINTER, PERCENT_INITIAL_ACTION)
(PERCENT_LEX_PARAM, PERCENT_PARSE_PARAM): Remove the {...} part,
which is now, again, a separate token.
Adjust all dependencies.
Wherever actions with $ and @ are used, use translate_code.
(action): Remove this nonterminal which is now useless.
* src/reader.c: Include assert.h, scan-gram.h and scan-code.h.
(grammar_current_rule_action_append): Use translate_code.
(packgram): Bound check ruleno, itemno, and rule_length.
* src/reader.h (gram_in, gram__flex_debug, scanner_cursor)
(last_string, last_braced_code_loc, max_left_semantic_context)
(scanner_initialize, scanner_free, scanner_last_string_free)
(gram_out, gram_lineno, YY_DECL_): Move to...
* src/scan-gram.h: this new file.
(YY_DECL): Rename as...
(GRAM_DECL): this.
* src/scan-code.h, src/scan-code.l, src/scan-code-c.c: New.
* src/scan-gram.l (gram_get_lineno, gram_get_in, gram_get_out):
(gram_get_leng, gram_get_text, gram_set_lineno, gram_set_in):
(gram_set_out, gram_get_debug, gram_set_debug, gram_lex_destroy):
Move these declarations, and...
(obstack_for_string, STRING_GROW, STRING_FINISH, STRING_FREE):
these to...
* src/flex-scanner.h: this new file.
* src/scan-gram.l (rule_length, rule_length_overflow)
(increment_rule_length): Remove.
(last_braced_code_loc): Rename as...
(gram_last_braced_code_loc): this.
Adjust to the changes of the parser.
Move all the handling of $ and @ into...
* src/scan-code.l: here.
* src/scan-gram.l (handle_dollar, handle_at): Remove.
(handle_action_dollar, handle_action_at): Move to...
* src/scan-code.l: here.
* src/Makefile.am (bison_SOURCES): Add flex-scanner.h,
scan-code.h, scan-code-c.c, scan-gram.h.
(EXTRA_bison_SOURCES): Add scan-code.l.
(BUILT_SOURCES): Add scan-code.c.
(yacc): Be robust to white spaces.

* tests/conflicts.at, tests/input.at, tests/reduce.at,
* tests/regression.at: Adjust the column numbers.
* tests/regression.at: Adjust the error message.
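
The column arithmetic that this commit moves into src/location.c can be illustrated in isolation. The following is a minimal sketch, not the code from the patch: `saturating_add_column` and `advance_past_tab` are hypothetical names, and the sketch assumes every byte is one column wide, whereas the real `add_column_width` asks gnulib's `mbsnwidth` for the display width. It shows the two ideas the patch relies on: column sums saturate at `INT_MAX` instead of overflowing, and a tab advances to the next stop using the expression `8 - ((column - 1) & 7)`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: add WIDTH columns to COLUMN, returning INT_MAX
   instead of overflowing.  COLUMN must be nonnegative.  */
static int
saturating_add_column (int column, size_t width)
{
  assert (0 <= column);
  unsigned int remaining = (unsigned int) (INT_MAX - column);
  return width <= remaining ? column + (int) width : INT_MAX;
}

/* Hypothetical helper: advance COLUMN past one tab character.  Tab
   stops are 8 columns apart; the distance to the next stop is the
   same expression the patch uses in location_compute.  COLUMN is
   assumed to be at least 1 here.  */
static int
advance_past_tab (int column)
{
  return saturating_add_column (column, 8 - ((column - 1) & 7));
}
```

With these definitions, a tab seen at column 1, 5, or 8 moves the cursor to column 9, and a tab at column 9 moves it to column 17. A sum that would exceed `INT_MAX` is clamped, which is what lets the caller detect the condition and emit a "column number overflow" warning.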
2026-07-26 19:00:33 +00:00 · 2006-06-06 16:40:06 +00:00
parent 184e42f065
commit e9071366c3
21 changed files with 1857 additions and 776 deletions
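
The `location_print` change in the diff below guards against an ending column of 0. As a rough sketch of that guard (the `struct span` type and both function names are hypothetical stand-ins, and the file name handling of the real function is left out), the printed end column is the stored end column minus one, clamped so it never goes negative:

```c
#include <stdio.h>

/* Hypothetical stand-in for Bison's location type, without file names.  */
struct span
{
  int start_line, start_column;
  int end_line, end_column;
};

/* The stored end column points one past the last character, so the
   printed value is one less; an end column of 0 is printed as 0.  */
static int
printed_end_column (int end_column)
{
  return 0 < end_column ? end_column - 1 : 0;
}

/* Print "LINE.COL", followed by "-LINE.COL" or "-COL" when the span
   covers more than one position.  */
static void
print_span (FILE *out, struct span s)
{
  int end_col = printed_end_column (s.end_column);
  fprintf (out, "%d.%d", s.start_line, s.start_column);
  if (s.start_line < s.end_line)
    fprintf (out, "-%d.%d", s.end_line, end_col);
  else if (s.start_column < end_col)
    fprintf (out, "-%d", end_col);
}
```

Computing the clamped value once and reusing it in every branch is the same design the patch adopts with its local `end_col` variable, replacing three separate `loc.end.column - 1` expressions.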
@@ -1,4 +1,65 @@
-$Id$
+2006-06-06  Akim Demaille  <[email protected]>
+
+	Extract the parsing of user actions from the grammar scanner.
+	As a consequence, the relation between the grammar scanner and
+	parser is much simpler.  We can also split "composite tokens" back
+	into simple tokens.
+	* src/gram.h (ITEM_NUMBER_MAX, RULE_NUMBER_MAX): New.
+	* src/scan-gram.l (add_column_width, adjust_location): Move to and
+	rename as...
+	* src/location.h, src/location.c (add_column_width)
+	(location_compute): these.
+	Fix the column count: the initial column is 0.
+	(location_print): Be robust to ending column being 0.
+	* src/location.h (boundary_set): New.
+	* src/main.c: Adjust to scanner_free being renamed as
+	gram_scanner_free.
+	* src/output.c: Include scan-code.h.
+	* src/parse-gram.y: Include scan-gram.h and scan-code.h.
+	Use boundary_set.
+	(PERCENT_DESTRUCTOR, PERCENT_PRINTER, PERCENT_INITIAL_ACTION)
+	(PERCENT_LEX_PARAM, PERCENT_PARSE_PARAM): Remove the {...} part,
+	which is now, again, a separate token.
+	Adjust all dependencies.
+	Wherever actions with $ and @ are used, use translate_code.
+	(action): Remove this nonterminal which is now useless.
+	* src/reader.c: Include assert.h, scan-gram.h and scan-code.h.
+	(grammar_current_rule_action_append): Use translate_code.
+	(packgram): Bound check ruleno, itemno, and rule_length.
+	* src/reader.h (gram_in, gram__flex_debug, scanner_cursor)
+	(last_string, last_braced_code_loc, max_left_semantic_context)
+	(scanner_initialize, scanner_free, scanner_last_string_free)
+	(gram_out, gram_lineno, YY_DECL_): Move to...
+	* src/scan-gram.h: this new file.
+	(YY_DECL): Rename as...
+	(GRAM_DECL): this.
+	* src/scan-code.h, src/scan-code.l, src/scan-code-c.c: New.
+	* src/scan-gram.l (gram_get_lineno, gram_get_in, gram_get_out):
+	(gram_get_leng, gram_get_text, gram_set_lineno, gram_set_in):
+	(gram_set_out, gram_get_debug, gram_set_debug, gram_lex_destroy):
+	Move these declarations, and...
+	(obstack_for_string, STRING_GROW, STRING_FINISH, STRING_FREE):
+	these to...
+	* src/flex-scanner.h: this new file.
+	* src/scan-gram.l (rule_length, rule_length_overflow)
+	(increment_rule_length): Remove.
+	(last_braced_code_loc): Rename as...
+	(gram_last_braced_code_loc): this.
+	Adjust to the changes of the parser.
+	Move all the handling of $ and @ into...
+	* src/scan-code.l: here.
+	* src/scan-gram.l (handle_dollar, handle_at): Remove.
+	(handle_action_dollar, handle_action_at): Move to...
+	* src/scan-code.l: here.
+	* src/Makefile.am (bison_SOURCES): Add flex-scanner.h,
+	scan-code.h, scan-code-c.c, scan-gram.h.
+	(EXTRA_bison_SOURCES): Add scan-code.l.
+	(BUILT_SOURCES): Add scan-code.c.
+	(yacc): Be robust to white spaces.
+
+	* tests/conflicts.at, tests/input.at, tests/reduce.at,
+	* tests/regression.at: Adjust the column numbers.
+	* tests/regression.at: Adjust the error message.

 2006-06-06  Joel E. Denny  <[email protected]>

@@ -16057,3 +16118,5 @@ $Id$
 	Copying and distribution of this file, with or without
 	modification, are permitted provided the copyright notice and this
 	notice are preserved.
+
+$Id$
@@ -1,4 +1,4 @@
-## Copyright (C) 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
+## Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006 Free Software Foundation, Inc.

 ## This program is free software; you can redistribute it and/or modify
 ## it under the terms of the GNU General Public License as published by
@@ -39,6 +39,7 @@ bison_SOURCES =					  \
 	conflicts.c conflicts.h			  \
 	derives.c derives.h			  \
 	files.c files.h				  \
+	flex-scanner.h				  \
 	getargs.c getargs.h			  \
 	gram.c gram.h				  \
 	lalr.h lalr.c				  \
@@ -54,8 +55,9 @@ bison_SOURCES =					  \
 	reduce.c reduce.h			  \
 	revision.c revision.h			  \
 	relation.c relation.h			  \
-	scan-gram-c.c				  \
-	scan-skel-c.c scan-skel.h		  \
+	scan-code.h scan-code-c.c		  \
+	scan-gram.h scan-gram-c.c		  \
+	scan-skel.h scan-skel-c.c		  \
 	state.c state.h				  \
 	symlist.c symlist.h			  \
 	symtab.c symtab.h			  \
@@ -65,15 +67,20 @@ bison_SOURCES =					  \
 	vcg.c vcg.h				  \
 	vcg_defaults.h

-EXTRA_bison_SOURCES = scan-skel.l scan-gram.l
+EXTRA_bison_SOURCES = scan-code.l scan-skel.l scan-gram.l

-BUILT_SOURCES = revision.c scan-skel.c scan-gram.c parse-gram.c parse-gram.h
+BUILT_SOURCES =					\
+parse-gram.c parse-gram.h			\
+revision.c					\
+scan-code.c					\
+scan-skel.c					\
+scan-gram.c 					\

 MOSTLYCLEANFILES = yacc

 yacc:
 	echo '#! /bin/sh' >$@
-	echo 'exec $(bindir)/bison -y "$$@"' >>$@
+	echo "exec '$(bindir)/bison' -y \"$$@\"" >>$@
 	chmod a+x $@

 echo:
@@ -1,6 +1,6 @@
 /* Data definitions for internal representation of Bison's input.

-   Copyright (C) 1984, 1986, 1989, 1992, 2001, 2002, 2003, 2004, 2005
+   Copyright (C) 1984, 1986, 1989, 1992, 2001, 2002, 2003, 2004, 2005, 2006
   Free Software Foundation, Inc.

   This file is part of Bison, the GNU Compiler Compiler.
@@ -115,6 +115,7 @@ extern int ntokens;
 extern int nvars;

 typedef int item_number;
+#define ITEM_NUMBER_MAX INT_MAX
 extern item_number *ritem;
 extern unsigned int nritems;

@@ -146,6 +147,7 @@ item_number_is_symbol_number (item_number i)

 /* Rule numbers.  */
 typedef int rule_number;
+#define RULE_NUMBER_MAX INT_MAX
 extern rule_number nrules;

 static inline item_number
@@ -1,6 +1,5 @@
 /* Locations for Bison
-
-   Copyright (C) 2002, 2005 Free Software Foundation, Inc.
+   Copyright (C) 2002, 2005, 2006 Free Software Foundation, Inc.

   This file is part of Bison, the GNU Compiler Compiler.

@@ -28,11 +27,80 @@

 location const empty_location;

+/* If BUF is null, add BUFSIZE (which in this case must be less than
+   INT_MAX) to COLUMN; otherwise, add mbsnwidth (BUF, BUFSIZE, 0) to
+   COLUMN.  If an overflow occurs, or might occur but is undetectable,
+   return INT_MAX.  Assume COLUMN is nonnegative.  */
+
+static inline int
+add_column_width (int column, char const *buf, size_t bufsize)
+{
+  size_t width;
+  unsigned int remaining_columns = INT_MAX - column;
+
+  if (buf)
+    {
+      if (INT_MAX / 2 <= bufsize)
+	return INT_MAX;
+      width = mbsnwidth (buf, bufsize, 0);
+    }
+  else
+    width = bufsize;
+
+  return width <= remaining_columns ? column + width : INT_MAX;
+}
+
+/* Set *LOC and adjust scanner cursor to account for token TOKEN of
+   size SIZE.  */
+
+void
+location_compute (location *loc, boundary *cur, char const *token, size_t size)
+{
+  int line = cur->line;
+  int column = cur->column;
+  char const *p0 = token;
+  char const *p = token;
+  char const *lim = token + size;
+
+  loc->start = *cur;
+
+  for (p = token; p < lim; p++)
+    switch (*p)
+      {
+      case '\n':
+	line += line < INT_MAX;
+	column = 1;
+	p0 = p + 1;
+	break;
+
+      case '\t':
+	column = add_column_width (column, p0, p - p0);
+	column = add_column_width (column, NULL, 8 - ((column - 1) & 7));
+	p0 = p + 1;
+	break;
+
+      default:
+	break;
+      }
+
+  cur->line = line;
+  cur->column = column = add_column_width (column, p0, p - p0);
+
+  loc->end = *cur;
+
+  if (line == INT_MAX && loc->start.line != INT_MAX)
+    warn_at (*loc, _("line number overflow"));
+  if (column == INT_MAX && loc->start.column != INT_MAX)
+    warn_at (*loc, _("column number overflow"));
+}
+
+
 /* Output to OUT the location LOC.
   Warning: it uses quotearg's slot 3.  */
 void
 location_print (FILE *out, location loc)
 {
+  int end_col = 0 < loc.end.column ? loc.end.column - 1 : 0;
  fprintf (out, "%s:%d.%d",
 	   quotearg_n_style (3, escape_quoting_style, loc.start.file),
 	   loc.start.line, loc.start.column);
@@ -40,9 +108,9 @@ location_print (FILE *out, location loc)
  if (loc.start.file != loc.end.file)
    fprintf (out, "-%s:%d.%d",
 	     quotearg_n_style (3, escape_quoting_style, loc.end.file),
-	     loc.end.line, loc.end.column - 1);
+	     loc.end.line, end_col);
  else if (loc.start.line < loc.end.line)
-    fprintf (out, "-%d.%d", loc.end.line, loc.end.column - 1);
-  else if (loc.start.column < loc.end.column - 1)
-    fprintf (out, "-%d", loc.end.column - 1);
+    fprintf (out, "-%d.%d", loc.end.line, end_col);
+  else if (loc.start.column < end_col)
+    fprintf (out, "-%d", end_col);
 }
@@ -40,6 +40,15 @@ typedef struct

 } boundary;

+/* Set the position of \a b. */
+static inline void
+boundary_set (boundary *b, const char *f, int l, int c)
+{
+  b->file = f;	
+  b->line = l;		
+  b->column = c;		
+}
+
 /* Return nonzero if A and B are equal boundaries.  */
 static inline bool
 equal_boundaries (boundary a, boundary b)
@@ -64,6 +73,11 @@ typedef struct

 extern location const empty_location;

+/* Set *LOC and adjust scanner cursor to account for token TOKEN of
+   size SIZE.  */
+void location_compute (location *loc,
+		       boundary *cur, char const *token, size_t size);
+
 void location_print (FILE *out, location loc);

 #endif /* ! defined LOCATION_H_ */
@@ -1,6 +1,7 @@
 /* Top level entry point of Bison.

-   Copyright (C) 1984, 1986, 1989, 1992, 1995, 2000, 2001, 2002, 2004, 2005
+   Copyright (C) 1984, 1986, 1989, 1992, 1995, 2000, 2001, 2002, 2004, 2005,
+   2006
   Free Software Foundation, Inc.

   This file is part of Bison, the GNU Compiler Compiler.
@@ -169,7 +170,7 @@ main (int argc, char *argv[])

  /* The scanner memory cannot be released right after parsing, as it
     contains things such as user actions, prologue, epilogue etc.  */
-  scanner_free ();
+  gram_scanner_free ();
  muscle_free ();
  uniqstrs_free ();
  timevar_pop (TV_FREE);
@@ -36,6 +36,7 @@
 #include "muscle_tab.h"
 #include "output.h"
 #include "reader.h"
+#include "scan-code.h"    /* max_left_semantic_context */
 #include "scan-skel.h"
 #include "symtab.h"
 #include "tables.h"
@@ -1,4 +1,4 @@
-/* A Bison parser, made by GNU Bison 2.2a.  */
+/* A Bison parser, made by GNU Bison 2.1b.  */

 /* Skeleton interface for Bison's Yacc-like parsers in C

@@ -148,7 +148,7 @@

 #if ! defined YYSTYPE && ! defined YYSTYPE_IS_DECLARED
 typedef union YYSTYPE
-#line 94 "parse-gram.y"
+#line 95 "../../src/parse-gram.y"
 {
  symbol *symbol;
  symbol_list *list;
@@ -158,7 +158,7 @@ typedef union YYSTYPE
  uniqstr uniqstr;
 }
 /* Line 1529 of yacc.c.  */
-#line 162 "parse-gram.h"
+#line 162 "../../src/parse-gram.h"
 	YYSTYPE;
 # define yystype YYSTYPE /* obsolescent; will be withdrawn */
 # define YYSTYPE_IS_DECLARED 1
@@ -32,6 +32,8 @@
 #include "quotearg.h"
 #include "reader.h"
 #include "symlist.h"
+#include "scan-gram.h"
+#include "scan-code.h"
 #include "strverscmp.h"

 #define YYLLOC_DEFAULT(Current, Rhs, N)  (Current) = lloc_default (Rhs, N)
@@ -84,9 +86,8 @@ static int current_prec = 0;
 {
  /* Bison's grammar can initial empty locations, hence a default
     location is needed. */
-  @$.start.file   = @$.end.file   = current_file;
-  @$.start.line   = @$.end.line   = 1;
-  @$.start.column = @$.end.column = 0;
+  boundary_set (&@$.start, current_file, 1, 0);
+  boundary_set (&@$.end, current_file, 1, 0);
 }

 /* Only NUMBERS have a value.  */
@@ -109,8 +110,8 @@ static int current_prec = 0;
 %token PERCENT_NTERM       "%nterm"

 %token PERCENT_TYPE        "%type"
-%token PERCENT_DESTRUCTOR  "%destructor {...}"
-%token PERCENT_PRINTER     "%printer {...}"
+%token PERCENT_DESTRUCTOR  "%destructor"
+%token PERCENT_PRINTER     "%printer"

 %token PERCENT_UNION       "%union {...}"

@@ -137,8 +138,8 @@ static int current_prec = 0;
  PERCENT_EXPECT_RR	  "%expect-rr"
  PERCENT_FILE_PREFIX     "%file-prefix"
  PERCENT_GLR_PARSER      "%glr-parser"
-  PERCENT_INITIAL_ACTION  "%initial-action {...}"
-  PERCENT_LEX_PARAM       "%lex-param {...}"
+  PERCENT_INITIAL_ACTION  "%initial-action"
+  PERCENT_LEX_PARAM       "%lex-param"
  PERCENT_LOCATIONS       "%locations"
  PERCENT_NAME_PREFIX     "%name-prefix"
  PERCENT_NO_DEFAULT_PREC "%no-default-prec"
@@ -146,7 +147,7 @@ static int current_prec = 0;
  PERCENT_NONDETERMINISTIC_PARSER
 			  "%nondeterministic-parser"
  PERCENT_OUTPUT          "%output"
-  PERCENT_PARSE_PARAM     "%parse-param {...}"
+  PERCENT_PARSE_PARAM     "%parse-param"
  PERCENT_PURE_PARSER     "%pure-parser"
  PERCENT_REQUIRE	  "%require"
  PERCENT_SKELETON        "%skeleton"
@@ -167,23 +168,14 @@ static int current_prec = 0;
 %token EPILOGUE        "epilogue"
 %token BRACED_CODE     "{...}"

-
 %type <chars> STRING string_content
-	      "%destructor {...}"
-	      "%initial-action {...}"
-	      "%lex-param {...}"
-	      "%parse-param {...}"
-	      "%printer {...}"
+	      "{...}"
 	      "%union {...}"
 	      PROLOGUE EPILOGUE
 %printer { fprintf (stderr, "\"%s\"", $$); }
 	      STRING string_content
 %printer { fprintf (stderr, "{\n%s\n}", $$); }
-	      "%destructor {...}"
-	      "%initial-action {...}"
-	      "%lex-param {...}"
-	      "%parse-param {...}"
-	      "%printer {...}"
+	      "{...}"
 	      "%union {...}"
 	      PROLOGUE EPILOGUE
 %type <uniqstr> TYPE
@@ -214,7 +206,8 @@ declarations:

 declaration:
  grammar_declaration
-| PROLOGUE                                 { prologue_augment ($1, @1); }
+| PROLOGUE                         { prologue_augment (translate_code ($1, @1),
+						       @1); }
 | "%debug"                                 { debug_flag = true; }
 | "%define" string_content
    {
@@ -232,17 +225,17 @@ declaration:
      nondeterministic_parser = true;
      glr_parser = true;
    }
-| "%initial-action {...}"
+| "%initial-action" "{...}"
    {
-      muscle_code_grow ("initial_action", $1, @1);
+      muscle_code_grow ("initial_action", translate_symbol_action ($2, @2), @2);
    }
-| "%lex-param {...}"			   { add_param ("lex_param", $1, @1); }
+| "%lex-param" "{...}"			   { add_param ("lex_param", $2, @2); }
 | "%locations"                             { locations_flag = true; }
 | "%name-prefix" "=" string_content        { spec_name_prefix = $3; }
 | "%no-lines"                              { no_lines_flag = true; }
 | "%nondeterministic-parser"		   { nondeterministic_parser = true; }
 | "%output" "=" string_content             { spec_outfile = $3; }
-| "%parse-param {...}"			   { add_param ("parse_param", $1, @1); }
+| "%parse-param" "{...}"		   { add_param ("parse_param", $2, @2); }
 | "%pure-parser"                           { pure_parser = true; }
 | "%require" string_content                { version_check (&@2, $2); }
 | "%skeleton" string_content               { skeleton = $2; }
@@ -275,19 +268,21 @@ grammar_declaration:
      typed = true;
      muscle_code_grow ("stype", body, @1);
    }
-| "%destructor {...}" symbols.1
+| "%destructor" "{...}" symbols.1
    {
      symbol_list *list;
-      for (list = $2; list; list = list->next)
-	symbol_destructor_set (list->sym, $1, @1);
-      symbol_list_free ($2);
+      const char *action = translate_symbol_action ($2, @2);
+      for (list = $3; list; list = list->next)
+ 	symbol_destructor_set (list->sym, action, @2);
+      symbol_list_free ($3);
    }
-| "%printer {...}" symbols.1
+| "%printer" "{...}" symbols.1
    {
      symbol_list *list;
-      for (list = $2; list; list = list->next)
-	symbol_printer_set (list->sym, $1, @1);
-      symbol_list_free ($2);
+      const char *action = translate_symbol_action ($2, @2);
+      for (list = $3; list; list = list->next)
+	symbol_printer_set (list->sym, action, @2);
+      symbol_list_free ($3);
    }
 | "%default-prec"
    {
@@ -346,7 +341,6 @@ type.opt:
 ;

 /* One or more nonterminals to be %typed. */
-
 symbols.1:
  symbol            { $$ = symbol_list_new ($1, @1); }
 | symbols.1 symbol  { $$ = symbol_list_prepend ($1, $2, @2); }
@@ -426,7 +420,9 @@ rhs:
    { grammar_current_rule_begin (current_lhs, current_lhs_location); }
 | rhs symbol
    { grammar_current_rule_symbol_append ($2, @2); }
-| rhs action
+| rhs "{...}"
+    { grammar_current_rule_action_append (gram_last_string,
+					  gram_last_braced_code_loc); }
 | rhs "%prec" symbol
    { grammar_current_rule_prec_set ($3, @3); }
 | rhs "%dprec" INT
@@ -440,23 +436,6 @@ symbol:
 | string_as_id    { $$ = $1; }
 ;

-/* Handle the semantics of an action specially, with a mid-rule
-   action, so that grammar_current_rule_action_append is invoked
-   immediately after the braced code is read by the scanner.
-
-   This implementation relies on the LALR(1) parsing algorithm.
-   If grammar_current_rule_action_append were executed in a normal
-   action for this rule, then when the input grammar contains two
-   successive actions, the scanner would have to read both actions
-   before reducing this rule.  That wouldn't work, since the scanner
-   relies on all preceding input actions being processed by
-   grammar_current_rule_action_append before it scans the next
-   action.  */
-action:
-    { grammar_current_rule_action_append (last_string, last_braced_code_loc); }
-  BRACED_CODE
-;
-
 /* A string used as an ID: quote it.  */
 string_as_id:
  STRING
@@ -477,8 +456,8 @@ epilogue.opt:
  /* Nothing.  */
 | "%%" EPILOGUE
    {
-      muscle_code_grow ("epilogue", $2, @2);
-      scanner_last_string_free ();
+      muscle_code_grow ("epilogue", translate_code ($2, @2), @2);
+      gram_scanner_last_string_free ();
    }
 ;

@@ -563,7 +542,7 @@ add_param (char const *type, char *decl, location loc)
      free (name);
    }

-  scanner_last_string_free ();
+  gram_scanner_last_string_free ();
 }

 static void
@@ -22,6 +22,7 @@

 #include <config.h>
 #include "system.h"
+#include <assert.h>

 #include <quotearg.h>

@@ -34,6 +35,8 @@
 #include "reader.h"
 #include "symlist.h"
 #include "symtab.h"
+#include "scan-gram.h"
+#include "scan-code.h"

 static void check_and_convert_grammar (void);

@@ -77,6 +80,8 @@ prologue_augment (const char *prologue, location loc)
    !typed ? &pre_prologue_obstack : &post_prologue_obstack;

  obstack_fgrow1 (oout, "]b4_syncline(%d, [[", loc.start.line);
+  /* FIXME: Protection of M4 characters missing here.  See
+     output.c:escaped_output.  */
  MUSCLE_OBSTACK_SGROW (oout,
 			quotearg_style (c_quoting_style, loc.start.file));
  obstack_sgrow (oout, "]])[\n");
@@ -398,9 +403,7 @@ grammar_current_rule_symbol_append (symbol *sym, location loc)
 void
 grammar_current_rule_action_append (const char *action, location loc)
 {
-  /* There's no need to invoke grammar_midrule_action here, since the
-     scanner already did it if necessary.  */
-  current_rule->action = action;
+  current_rule->action = translate_rule_action (current_rule, action, loc);
  current_rule->action_location = loc;
 }

@@ -426,6 +429,7 @@ packgram (void)

  while (p)
    {
+      int rule_length = 0;
      symbol *ruleprec = p->ruleprec;
      rules[ruleno].user_number = ruleno;
      rules[ruleno].number = ruleno;
@@ -440,18 +444,22 @@ packgram (void)
      rules[ruleno].action = p->action;
      rules[ruleno].action_location = p->action_location;

-      p = p->next;
-      while (p && p->sym)
+      for (p = p->next; p && p->sym; p = p->next)
 	{
+	  ++rule_length;
+
+	  /* Don't allow rule_length == INT_MAX, since that might
+	     cause confusion with strtol if INT_MAX == LONG_MAX.  */
+	  if (rule_length == INT_MAX)
+	      fatal_at (rules[ruleno].location, _("rule is too long"));
+
 	  /* item_number = symbol_number.
 	     But the former needs to contain more: negative rule numbers. */
 	  ritem[itemno++] = symbol_number_as_item_number (p->sym->number);
 	  /* A rule gets by default the precedence and associativity
-	     of the last token in it.  */
+	     of its last token.  */
 	  if (p->sym->class == token_sym && default_prec)
 	    rules[ruleno].prec = p->sym;
-	  if (p)
-	    p = p->next;
 	}

      /* If this rule has a %prec,
@@ -461,8 +469,11 @@ packgram (void)
 	  rules[ruleno].precsym = ruleprec;
 	  rules[ruleno].prec = ruleprec;
 	}
+      /* An item ends by the rule number (negated).  */
      ritem[itemno++] = rule_number_as_item_number (ruleno);
+      assert (itemno < ITEM_NUMBER_MAX);
      ++ruleno;
+      assert (ruleno < RULE_NUMBER_MAX);

      if (p)
 	p = p->next;
@@ -511,7 +522,7 @@ reader (void)

  gram__flex_debug = trace_flag & trace_scan;
  gram_debug = trace_flag & trace_parse;
-  scanner_initialize ();
+  gram_scanner_initialize ();
  gram_parse ();

  if (! complaint_issued)
@@ -35,26 +35,6 @@ typedef struct merger_list
  uniqstr type;
 } merger_list;

-/* From the scanner.  */
-extern FILE *gram_in;
-extern int gram__flex_debug;
-extern boundary scanner_cursor;
-extern char *last_string;
-extern location last_braced_code_loc;
-extern int max_left_semantic_context;
-void scanner_initialize (void);
-void scanner_free (void);
-void scanner_last_string_free (void);
-
-/* These are declared by the scanner, but not used.  We put them here
-   to pacify "make syntax-check".  */
-extern FILE *gram_out;
-extern int gram_lineno;
-
-# define YY_DECL int gram_lex (YYSTYPE *val, location *loc)
-YY_DECL;
-
-
 /* From the parser.  */
 extern int gram_debug;
 int gram_parse (void);
@@ -0,0 +1,866 @@
+/* Bison Grammar Scanner                             -*- C -*-
+
+   Copyright (C) 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of Bison, the GNU Compiler Compiler.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+   02110-1301  USA
+*/
+
+%option debug nodefault nounput noyywrap never-interactive
+%option prefix="gram_" outfile="lex.yy.c"
+
+%{
+#include "system.h"
+
+#include <mbswidth.h>
+#include <get-errno.h>
+#include <quote.h>
+
+#include "complain.h"
+#include "files.h"
+#include "getargs.h"
+#include "gram.h"
+#include "quotearg.h"
+#include "reader.h"
+#include "uniqstr.h"
+
+#define YY_USER_INIT					\
+  do							\
+    {							\
+      scanner_cursor.file = current_file;		\
+      scanner_cursor.line = 1;				\
+      scanner_cursor.column = 1;			\
+      code_start = scanner_cursor;			\
+    }							\
+  while (0)
+
+/* Location of scanner cursor.  */
+boundary scanner_cursor;
+
+static void adjust_location (location *, char const *, size_t);
+#define YY_USER_ACTION  adjust_location (loc, yytext, yyleng);
+
+static size_t no_cr_read (FILE *, char *, size_t);
+#define YY_INPUT(buf, result, size) ((result) = no_cr_read (yyin, buf, size))
+
+/* Within well-formed rules, RULE_LENGTH is the number of values in
+   the current rule so far, which says where to find `$0' with respect
+   to the top of the stack.  It is not the same as the rule->length in
+   the case of mid rule actions.
+
+   Outside of well-formed rules, RULE_LENGTH has an undefined value.  */
+int rule_length;
+
+static void handle_dollar (int token_type, char *cp, location loc);
+static void handle_at (int token_type, char *cp, location loc);
+static void handle_syncline (char *args);
+static unsigned long int scan_integer (char const *p, int base, location loc);
+static int convert_ucn_to_byte (char const *hex_text);
+static void unexpected_eof (boundary, char const *);
+static void unexpected_newline (boundary, char const *);
+
+%}
+%x SC_COMMENT SC_LINE_COMMENT SC_YACC_COMMENT
+%x SC_STRING SC_CHARACTER
+%x SC_ESCAPED_STRING SC_ESCAPED_CHARACTER
+%x SC_PRE_CODE SC_BRACED_CODE SC_PROLOGUE SC_EPILOGUE
+
+letter	  [.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_]
+id	  {letter}({letter}|[0-9])*
+directive %{letter}({letter}|[0-9]|-)*
+int	  [0-9]+
+
+/* POSIX says that a tag must be both an id and a C union member, but
+   historically almost any character is allowed in a tag.  We disallow
+   NUL and newline, as this simplifies our implementation.  */
+tag	 [^\0\n>]+
+
+/* Zero or more instances of backslash-newline.  Following GCC, allow
+   white space between the backslash and the newline.  */
+splice	 (\\[ \f\t\v]*\n)*
+
+%%
+%{
+  /* Nesting level of the current code in braces.  */
+  int braces_level IF_LINT (= 0);
+
+  /* Parent context state, when applicable.  */
+  int context_state IF_LINT (= 0);
+
+  /* Token type to return, when applicable.  */
+  int token_type IF_LINT (= 0);
+
+  /* Where containing code started, when applicable.  Its initial
+     value is relevant only when yylex is invoked in the SC_EPILOGUE
+     start condition.  */
+  boundary code_start = scanner_cursor;
+
+  /* Where containing comment or string or character literal started,
+     when applicable.  */
+  boundary token_start IF_LINT (= scanner_cursor);
+%}
+
+
+  /*-----------------------.
+  | Scanning white space.  |
+  `-----------------------*/
+
+<INITIAL>
+{
+  /* Comments and white space.  */
+  ","	       warn_at (*loc, _("stray `,' treated as white space"));
+  [ \f\n\t\v]  |
+  "//".*       ;
+  "/*" {
+    token_start = loc->start;
+    context_state = YY_START;
+    BEGIN SC_YACC_COMMENT;
+  }
+
+  /* #line directives are not documented, and may be withdrawn or
+     modified in future versions of Bison.  */
+  ^"#line "{int}" \"".*"\"\n" {
+    handle_syncline (yytext + sizeof "#line " - 1);
+  }
+}
+
+
+  /*----------------------------.
+  | Scanning Bison directives.  |
+  `----------------------------*/
+<INITIAL>
+{
+
+  /* Code in between braces.  */
+  "{" {
+    STRING_GROW;
+    token_type = BRACED_CODE;
+    braces_level = 0;
+    code_start = loc->start;
+    BEGIN SC_BRACED_CODE;
+  }
+
+}
+
+
+  /*------------------------------------------------------------.
+  | Scanning a C comment.  The initial `/ *' is already eaten.  |
+  `------------------------------------------------------------*/
+
+<SC_COMMENT>
+{
+  "*"{splice}"/"  STRING_GROW; BEGIN context_state;
+  <<EOF>>	  unexpected_eof (token_start, "*/"); BEGIN context_state;
+}
+
+
+  /*--------------------------------------------------------------.
+  | Scanning a line comment.  The initial `//' is already eaten.  |
+  `--------------------------------------------------------------*/
+
+<SC_LINE_COMMENT>
+{
+  "\n"		 STRING_GROW; BEGIN context_state;
+  {splice}	 STRING_GROW;
+  <<EOF>>	 BEGIN context_state;
+}
+
+
+  /*------------------------------------------------.
+  | Scanning a Bison string, including its escapes. |
+  | The initial quote is already eaten.             |
+  `------------------------------------------------*/
+
+<SC_ESCAPED_STRING>
+{
+  "\"" {
+    STRING_FINISH;
+    loc->start = token_start;
+    val->chars = last_string;
+    rule_length++;
+    BEGIN INITIAL;
+    return STRING;
+  }
+  \n		unexpected_newline (token_start, "\"");	BEGIN INITIAL;
+  <<EOF>>	unexpected_eof (token_start, "\"");	BEGIN INITIAL;
+}
+
+  /*----------------------------------------------------------.
+  | Scanning a Bison character literal, decoding its escapes. |
+  | The initial quote is already eaten.			      |
+  `----------------------------------------------------------*/
+
+<SC_ESCAPED_CHARACTER>
+{
+  "'" {
+    unsigned char last_string_1;
+    STRING_GROW;
+    STRING_FINISH;
+    loc->start = token_start;
+    val->symbol = symbol_get (quotearg_style (escape_quoting_style,
+					      last_string),
+			      *loc);
+    symbol_class_set (val->symbol, token_sym, *loc);
+    last_string_1 = last_string[1];
+    symbol_user_token_number_set (val->symbol, last_string_1, *loc);
+    STRING_FREE;
+    rule_length++;
+    BEGIN INITIAL;
+    return ID;
+  }
+  \n		unexpected_newline (token_start, "'");	BEGIN INITIAL;
+  <<EOF>>	unexpected_eof (token_start, "'");	BEGIN INITIAL;
+}
+
+<SC_ESCAPED_CHARACTER,SC_ESCAPED_STRING>
+{
+  \0	    complain_at (*loc, _("invalid null character"));
+}
+
+
+  /*----------------------------.
+  | Decode escaped characters.  |
+  `----------------------------*/
+
+<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>
+{
+  \\[0-7]{1,3} {
+    unsigned long int c = strtoul (yytext + 1, 0, 8);
+    if (UCHAR_MAX < c)
+      complain_at (*loc, _("invalid escape sequence: %s"), quote (yytext));
+    else if (! c)
+      complain_at (*loc, _("invalid null character: %s"), quote (yytext));
+    else
+      obstack_1grow (&obstack_for_string, c);
+  }
+
+  \\x[0-9abcdefABCDEF]+ {
+    unsigned long int c;
+    set_errno (0);
+    c = strtoul (yytext + 2, 0, 16);
+    if (UCHAR_MAX < c || get_errno ())
+      complain_at (*loc, _("invalid escape sequence: %s"), quote (yytext));
+    else if (! c)
+      complain_at (*loc, _("invalid null character: %s"), quote (yytext));
+    else
+      obstack_1grow (&obstack_for_string, c);
+  }
+
+  \\a	obstack_1grow (&obstack_for_string, '\a');
+  \\b	obstack_1grow (&obstack_for_string, '\b');
+  \\f	obstack_1grow (&obstack_for_string, '\f');
+  \\n	obstack_1grow (&obstack_for_string, '\n');
+  \\r	obstack_1grow (&obstack_for_string, '\r');
+  \\t	obstack_1grow (&obstack_for_string, '\t');
+  \\v	obstack_1grow (&obstack_for_string, '\v');
+
+  /* \\[\"\'?\\] would be shorter, but it confuses xgettext.  */
+  \\("\""|"'"|"?"|"\\")  obstack_1grow (&obstack_for_string, yytext[1]);
+
+  \\(u|U[0-9abcdefABCDEF]{4})[0-9abcdefABCDEF]{4} {
+    int c = convert_ucn_to_byte (yytext);
+    if (c < 0)
+      complain_at (*loc, _("invalid escape sequence: %s"), quote (yytext));
+    else if (! c)
+      complain_at (*loc, _("invalid null character: %s"), quote (yytext));
+    else
+      obstack_1grow (&obstack_for_string, c);
+  }
+  \\(.|\n)	{
+    complain_at (*loc, _("unrecognized escape sequence: %s"), quote (yytext));
+    STRING_GROW;
+  }
+}
+
+  /*--------------------------------------------.
+  | Scanning user-code characters and strings.  |
+  `--------------------------------------------*/
+
+<SC_CHARACTER,SC_STRING>
+{
+  {splice}|\\{splice}[^\n$@\[\]]	STRING_GROW;
+}
+
+<SC_CHARACTER>
+{
+  "'"		STRING_GROW; BEGIN context_state;
+  \n		unexpected_newline (token_start, "'"); BEGIN context_state;
+  <<EOF>>	unexpected_eof (token_start, "'"); BEGIN context_state;
+}
+
+<SC_STRING>
+{
+  "\""		STRING_GROW; BEGIN context_state;
+  \n		unexpected_newline (token_start, "\""); BEGIN context_state;
+  <<EOF>>	unexpected_eof (token_start, "\""); BEGIN context_state;
+}
+
+
+  /*---------------------------------------------------.
+  | Strings, comments etc. can be found in user code.  |
+  `---------------------------------------------------*/
+
+<INITIAL>
+{
+  "'" {
+    STRING_GROW;
+    context_state = YY_START;
+    token_start = loc->start;
+    BEGIN SC_CHARACTER;
+  }
+  "\"" {
+    STRING_GROW;
+    context_state = YY_START;
+    token_start = loc->start;
+    BEGIN SC_STRING;
+  }
+  "/"{splice}"*" {
+    STRING_GROW;
+    context_state = YY_START;
+    token_start = loc->start;
+    BEGIN SC_COMMENT;
+  }
+  "/"{splice}"/" {
+    STRING_GROW;
+    context_state = YY_START;
+    BEGIN SC_LINE_COMMENT;
+  }
+}
+
+
+  /*---------------------------------------------------------------.
+  | Scanning some code in braces (%union and actions). The initial |
+  | "{" is already eaten.                                          |
+  `---------------------------------------------------------------*/
+
+<INITIAL>
+{
+  "{"|"<"{splice}"%"  STRING_GROW; braces_level++;
+  "%"{splice}">"      STRING_GROW; braces_level--;
+  "}" {
+    bool outer_brace = --braces_level < 0;
+
+    /* As an undocumented Bison extension, append `;' before the last
+       brace in braced code, so that the user code can omit trailing
+       `;'.  But do not append `;' if emulating Yacc, since Yacc does
+       not append one.
+
+       FIXME: Bison should warn if a semicolon seems to be necessary
+       here, and should omit the semicolon if it seems unnecessary
+       (e.g., after ';', '{', or '}', each followed by comments or
+       white space).  Such a warning shouldn't depend on --yacc; it
+       should depend on a new --pedantic option, which would cause
+       Bison to warn if it detects an extension to POSIX.  --pedantic
+       should also diagnose other Bison extensions like %yacc.
+       Perhaps there should also be a GCC-style --pedantic-errors
+       option, so that such warnings are diagnosed as errors.  */
+    if (outer_brace && token_type == BRACED_CODE && ! yacc_flag)
+      obstack_1grow (&obstack_for_string, ';');
+
+    obstack_1grow (&obstack_for_string, '}');
+
+    if (outer_brace)
+      {
+	STRING_FINISH;
+	rule_length++;
+	loc->start = code_start;
+	val->chars = last_string;
+	BEGIN INITIAL;
+	return token_type;
+      }
+  }
+
+  /* Tokenize `<<%' correctly (as `<<' `%') rather than incorrectly
+     (as `<' `<%').  */
+  "<"{splice}"<"  STRING_GROW;
+
+  "$"("<"{tag}">")?(-?[0-9]+|"$")  handle_dollar (token_type, yytext, *loc);
+  "@"(-?[0-9]+|"$")		   handle_at (token_type, yytext, *loc);
+
+  <<EOF>>  unexpected_eof (code_start, "}"); BEGIN INITIAL;
+}
+
+
+  /*--------------------------------------------------------------.
+  | Scanning some prologue: from "%{" (already scanned) to "%}".  |
+  `--------------------------------------------------------------*/
+
+<SC_PROLOGUE>
+{
+  "%}" {
+    STRING_FINISH;
+    loc->start = code_start;
+    val->chars = last_string;
+    BEGIN INITIAL;
+    return PROLOGUE;
+  }
+
+  <<EOF>>  unexpected_eof (code_start, "%}"); BEGIN INITIAL;
+}
+
+
+  /*---------------------------------------------------------------.
+  | Scanning the epilogue (everything after the second "%%", which |
+  | has already been eaten).                                       |
+  `---------------------------------------------------------------*/
+
+<SC_EPILOGUE>
+{
+  <<EOF>> {
+    STRING_FINISH;
+    loc->start = code_start;
+    val->chars = last_string;
+    BEGIN INITIAL;
+    return EPILOGUE;
+  }
+}
+
+
+  /*-----------------------------------------.
+  | Escape M4 quoting characters in C code.  |
+  `-----------------------------------------*/
+
+<SC_COMMENT,SC_LINE_COMMENT,SC_STRING,SC_CHARACTER,SC_BRACED_CODE,SC_PROLOGUE,SC_EPILOGUE>
+{
+  \$	obstack_sgrow (&obstack_for_string, "$][");
+  \@	obstack_sgrow (&obstack_for_string, "@@");
+  \[	obstack_sgrow (&obstack_for_string, "@{");
+  \]	obstack_sgrow (&obstack_for_string, "@}");
+}
+
+
+  /*-----------------------------------------------------.
+  | By default, grow the string obstack with the input.  |
+  `-----------------------------------------------------*/
+
+<SC_COMMENT,SC_LINE_COMMENT,SC_BRACED_CODE,SC_PROLOGUE,SC_EPILOGUE,SC_STRING,SC_CHARACTER,SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>.	|
+<SC_COMMENT,SC_LINE_COMMENT,SC_BRACED_CODE,SC_PROLOGUE,SC_EPILOGUE>\n	STRING_GROW;
+
+%%
+
+/* Keeps track of the maximum number of semantic values to the left of
+   a handle (those referenced by $0, $-1, etc.) that are required by
+   the semantic actions of this grammar. */
+int max_left_semantic_context = 0;
+
+/* Set *LOC and adjust scanner cursor to account for token TOKEN of
+   size SIZE.  */
+
+static void
+adjust_location (location *loc, char const *token, size_t size)
+{
+  int line = scanner_cursor.line;
+  int column = scanner_cursor.column;
+  char const *p0 = token;
+  char const *p = token;
+  char const *lim = token + size;
+
+  loc->start = scanner_cursor;
+
+  for (p = token; p < lim; p++)
+    switch (*p)
+      {
+      case '\n':
+	line++;
+	column = 1;
+	p0 = p + 1;
+	break;
+
+      case '\t':
+	column += mbsnwidth (p0, p - p0, 0);
+	column += 8 - ((column - 1) & 7);
+	p0 = p + 1;
+	break;
+      }
+
+  scanner_cursor.line = line;
+  scanner_cursor.column = column + mbsnwidth (p0, p - p0, 0);
+
+  loc->end = scanner_cursor;
+}
+
+
+/* Read bytes from FP into buffer BUF of size SIZE.  Return the
+   number of bytes read.  Remove '\r' from input, treating \r\n
+   and isolated \r as \n.  */
+
+static size_t
+no_cr_read (FILE *fp, char *buf, size_t size)
+{
+  size_t bytes_read = fread (buf, 1, size, fp);
+  if (bytes_read)
+    {
+      char *w = memchr (buf, '\r', bytes_read);
+      if (w)
+	{
+	  char const *r = ++w;
+	  char const *lim = buf + bytes_read;
+
+	  for (;;)
+	    {
+	      /* Found a '\r'.  Treat it like '\n', but ignore any
+		 '\n' that immediately follows.  */
+	      w[-1] = '\n';
+	      if (r == lim)
+		{
+		  int ch = getc (fp);
+		  if (ch != '\n' && ungetc (ch, fp) != ch)
+		    break;
+		}
+	      else if (*r == '\n')
+		r++;
+
+	      /* Copy until the next '\r'.  */
+	      do
+		{
+		  if (r == lim)
+		    return w - buf;
+		}
+	      while ((*w++ = *r++) != '\r');
+	    }
+
+	  return w - buf;
+	}
+    }
+
+  return bytes_read;
+}
+
+
+/*------------------------------------------------------------------.
+| TEXT is pointing to a wannabe semantic value (i.e., a `$').       |
+|                                                                   |
+| Possible inputs: $[<TYPENAME>]($|integer)                         |
+|                                                                   |
+| Output to OBSTACK_FOR_STRING a reference to this semantic value.  |
+`------------------------------------------------------------------*/
+
+static inline bool
+handle_action_dollar (char *text, location loc)
+{
+  const char *type_name = NULL;
+  char *cp = text + 1;
+
+  if (! current_rule)
+    return false;
+
+  /* Get the type name if explicit. */
+  if (*cp == '<')
+    {
+      type_name = ++cp;
+      while (*cp != '>')
+	++cp;
+      *cp = '\0';
+      ++cp;
+    }
+
+  if (*cp == '$')
+    {
+      if (!type_name)
+	type_name = symbol_list_n_type_name_get (current_rule, loc, 0);
+      if (!type_name && typed)
+	complain_at (loc, _("$$ of `%s' has no declared type"),
+		     current_rule->sym->tag);
+      if (!type_name)
+	type_name = "";
+      obstack_fgrow1 (&obstack_for_string,
+		      "]b4_lhs_value([%s])[", type_name);
+    }
+  else
+    {
+      long int num;
+      set_errno (0);
+      num = strtol (cp, 0, 10);
+
+      if (INT_MIN <= num && num <= rule_length && ! get_errno ())
+	{
+	  int n = num;
+	  if (1-n > max_left_semantic_context)
+	    max_left_semantic_context = 1-n;
+	  if (!type_name && n > 0)
+	    type_name = symbol_list_n_type_name_get (current_rule, loc, n);
+	  if (!type_name && typed)
+	    complain_at (loc, _("$%d of `%s' has no declared type"),
+			 n, current_rule->sym->tag);
+	  if (!type_name)
+	    type_name = "";
+	  obstack_fgrow3 (&obstack_for_string,
+			  "]b4_rhs_value(%d, %d, [%s])[",
+			  rule_length, n, type_name);
+	}
+      else
+	complain_at (loc, _("integer out of range: %s"), quote (text));
+    }
+
+  return true;
+}
+
+
+/*----------------------------------------------------------------.
+| Map `$?' onto the proper M4 symbol, depending on its TOKEN_TYPE |
+| (are we in an action?).                                         |
+`----------------------------------------------------------------*/
+
+static void
+handle_dollar (int token_type, char *text, location loc)
+{
+  switch (token_type)
+    {
+    case BRACED_CODE:
+      if (handle_action_dollar (text, loc))
+	return;
+      break;
+
+    case PERCENT_DESTRUCTOR:
+    case PERCENT_INITIAL_ACTION:
+    case PERCENT_PRINTER:
+      if (text[1] == '$')
+	{
+	  obstack_sgrow (&obstack_for_string, "]b4_dollar_dollar[");
+	  return;
+	}
+      break;
+
+    default:
+      break;
+    }
+
+  complain_at (loc, _("invalid value: %s"), quote (text));
+}
+
+
+/*------------------------------------------------------.
+| TEXT is a location token (i.e., a `@...').  Output to |
+| OBSTACK_FOR_STRING a reference to this location.      |
+`------------------------------------------------------*/
+
+static inline bool
+handle_action_at (char *text, location loc)
+{
+  char *cp = text + 1;
+  locations_flag = true;
+
+  if (! current_rule)
+    return false;
+
+  if (*cp == '$')
+    obstack_sgrow (&obstack_for_string, "]b4_lhs_location[");
+  else
+    {
+      long int num;
+      set_errno (0);
+      num = strtol (cp, 0, 10);
+
+      if (INT_MIN <= num && num <= rule_length && ! get_errno ())
+	{
+	  int n = num;
+	  obstack_fgrow2 (&obstack_for_string, "]b4_rhs_location(%d, %d)[",
+			  rule_length, n);
+	}
+      else
+	complain_at (loc, _("integer out of range: %s"), quote (text));
+    }
+
+  return true;
+}
+
+
+/*----------------------------------------------------------------.
+| Map `@?' onto the proper M4 symbol, depending on its TOKEN_TYPE |
+| (are we in an action?).                                         |
+`----------------------------------------------------------------*/
+
+static void
+handle_at (int token_type, char *text, location loc)
+{
+  switch (token_type)
+    {
+    case BRACED_CODE:
+      handle_action_at (text, loc);
+      return;
+
+    case PERCENT_INITIAL_ACTION:
+    case PERCENT_DESTRUCTOR:
+    case PERCENT_PRINTER:
+      if (text[1] == '$')
+	{
+	  obstack_sgrow (&obstack_for_string, "]b4_at_dollar[");
+	  return;
+	}
+      break;
+
+    default:
+      break;
+    }
+
+  complain_at (loc, _("invalid value: %s"), quote (text));
+}
+
+
+/*------------------------------------------------------.
+| Scan NUMBER for a base-BASE integer at location LOC.  |
+`------------------------------------------------------*/
+
+static unsigned long int
+scan_integer (char const *number, int base, location loc)
+{
+  unsigned long int num;
+  set_errno (0);
+  num = strtoul (number, 0, base);
+  if (INT_MAX < num || get_errno ())
+    {
+      complain_at (loc, _("integer out of range: %s"), quote (number));
+      num = INT_MAX;
+    }
+  return num;
+}
+
+
+/*------------------------------------------------------------------.
+| Convert universal character name UCN to a single-byte character,  |
+| and return that character.  Return -1 if UCN does not correspond  |
+| to a single-byte character.					    |
+`------------------------------------------------------------------*/
+
+static int
+convert_ucn_to_byte (char const *ucn)
+{
+  unsigned long int code = strtoul (ucn + 2, 0, 16);
+
+  /* FIXME: Currently we assume Unicode-compatible unibyte characters
+     on ASCII hosts (i.e., Latin-1 on hosts with 8-bit bytes).  On
+     non-ASCII hosts we support only the portable C character set.
+     These limitations should be removed once we add support for
+     multibyte characters.  */
+
+  if (UCHAR_MAX < code)
+    return -1;
+
+#if ! ('$' == 0x24 && '@' == 0x40 && '`' == 0x60 && '~' == 0x7e)
+  {
+    /* A non-ASCII host.  Use CODE to index into a table of the C
+       basic execution character set, which is guaranteed to exist on
+       all Standard C platforms.  This table also includes '$', '@',
+       and '`', which are not in the basic execution character set but
+       which are unibyte characters on all the platforms that we know
+       about.  */
+    static signed char const table[] =
+      {
+	'\0',   -1,   -1,   -1,   -1,   -1,   -1, '\a',
+	'\b', '\t', '\n', '\v', '\f', '\r',   -1,   -1,
+	  -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
+	  -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
+	 ' ',  '!',  '"',  '#',  '$',  '%',  '&', '\'',
+	 '(',  ')',  '*',  '+',  ',',  '-',  '.',  '/',
+	 '0',  '1',  '2',  '3',  '4',  '5',  '6',  '7',
+	 '8',  '9',  ':',  ';',  '<',  '=',  '>',  '?',
+	 '@',  'A',  'B',  'C',  'D',  'E',  'F',  'G',
+	 'H',  'I',  'J',  'K',  'L',  'M',  'N',  'O',
+	 'P',  'Q',  'R',  'S',  'T',  'U',  'V',  'W',
+	 'X',  'Y',  'Z',  '[', '\\',  ']',  '^',  '_',
+	 '`',  'a',  'b',  'c',  'd',  'e',  'f',  'g',
+	 'h',  'i',  'j',  'k',  'l',  'm',  'n',  'o',
+	 'p',  'q',  'r',  's',  't',  'u',  'v',  'w',
+	 'x',  'y',  'z',  '{',  '|',  '}',  '~'
+      };
+
+    code = code < sizeof table ? table[code] : -1;
+  }
+#endif
+
+  return code;
+}
+
+
+/*----------------------------------------------------------------.
+| Handle `#line INT "FILE"'.  ARGS has already skipped `#line '.  |
+`----------------------------------------------------------------*/
+
+static void
+handle_syncline (char *args)
+{
+  int lineno = strtol (args, &args, 10);
+  const char *file = NULL;
+  file = strchr (args, '"') + 1;
+  *strchr (file, '"') = 0;
+  scanner_cursor.file = current_file = uniqstr_new (file);
+  scanner_cursor.line = lineno;
+  scanner_cursor.column = 1;
+}
+
+
+/*----------------------------------------------------------------.
+| For a token or comment starting at START, report message MSGID, |
+| which should say that an end marker was found before		  |
+| the expected TOKEN_END.					  |
+`----------------------------------------------------------------*/
+
+static void
+unexpected_end (boundary start, char const *msgid, char const *token_end)
+{
+  location loc;
+  loc.start = start;
+  loc.end = scanner_cursor;
+  complain_at (loc, _(msgid), token_end);
+}
+
+
+/*------------------------------------------------------------------------.
+| Report an unexpected EOF in a token or comment starting at START.       |
+| An end of file was encountered and the expected TOKEN_END was missing.  |
+`------------------------------------------------------------------------*/
+
+static void
+unexpected_eof (boundary start, char const *token_end)
+{
+  unexpected_end (start, N_("missing `%s' at end of file"), token_end);
+}
+
+
+/*----------------------------------------.
+| Likewise, but for unexpected newlines.  |
+`----------------------------------------*/
+
+static void
+unexpected_newline (boundary start, char const *token_end)
+{
+  unexpected_end (start, N_("missing `%s' at end of line"), token_end);
+}
+
+
+/*-------------------------.
+| Initialize the scanner.  |
+`-------------------------*/
+
+void
+scanner_initialize (void)
+{
+  obstack_init (&obstack_for_string);
+}
+
+
+/*-----------------------------------------------.
+| Free all the memory allocated to the scanner.  |
+`-----------------------------------------------*/
+
+void
+scanner_free (void)
+{
+  obstack_free (&obstack_for_string, 0);
+  /* Reclaim Flex's buffers.  */
+  yy_delete_buffer (YY_CURRENT_BUFFER);
+}
@@ -0,0 +1,2 @@
+#include <config.h>
+#include "scan-code.c"
@@ -0,0 +1,47 @@
+/* Bison Action Scanner
+
+   Copyright (C) 2006 Free Software Foundation, Inc.
+
+   This file is part of Bison, the GNU Compiler Compiler.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+   02110-1301  USA
+*/
+
+#ifndef SCAN_CODE_H_
+# define SCAN_CODE_H_
+
+# include "location.h"
+# include "symlist.h"
+
+/* Keeps track of the maximum number of semantic values to the left of
+   a handle (those referenced by $0, $-1, etc.) that are required by
+   the semantic actions of this grammar. */
+extern int max_left_semantic_context;
+
+void code_scanner_free (void);
+
+/* The action A contains $$, $1 etc. referring to the values
+   of the rule R. */
+const char *translate_rule_action (symbol_list *r, const char *a, location l);
+
+/* The action A refers to $$ and @$ only, referring to a symbol. */
+const char *translate_symbol_action (const char *a, location l);
+
+/* The action contains no special escapes, just protect M4 special
+   symbols.  */
+const char *translate_code (const char *a, location l);
+
+#endif /* !SCAN_CODE_H_ */
@@ -0,0 +1,358 @@
+/* Bison Action Scanner                             -*- C -*-
+
+   Copyright (C) 2006 Free Software Foundation, Inc.
+
+   This file is part of Bison, the GNU Compiler Compiler.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+   02110-1301  USA
+*/
+
+%option debug nodefault nounput noyywrap never-interactive
+%option prefix="code_" outfile="lex.yy.c"
+
+%{
+/* Work around a bug in flex 2.5.31.  See Debian bug 333231
+   <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=333231>.  */
+#undef code_wrap
+#define code_wrap() 1
+
+#define FLEX_PREFIX(Id) code_ ## Id
+#include "flex-scanner.h"
+#include "reader.h"
+#include "getargs.h"
+#include <assert.h>
+#include <get-errno.h>
+#include <quote.h>
+
+#include "scan-code.h"
+
+/* The current calling start condition: SC_RULE_ACTION,
+   SC_SYMBOL_ACTION, or INITIAL. */
+# define YY_DECL const char *code_lex (int sc_context)
+YY_DECL;
+
+#define YY_USER_ACTION  location_compute (loc, &loc->end, yytext, yyleng);
+
+static void handle_action_dollar (char *cp, location loc);
+static void handle_action_at (char *cp, location loc);
+static location the_location;
+static location *loc = &the_location;
+
+/* The rule being processed. */
+symbol_list *current_rule;
+%}
+ /* C and C++ comments in code. */
+%x SC_COMMENT SC_LINE_COMMENT
+ /* Strings and characters in code. */
+%x SC_STRING SC_CHARACTER
+ /* Whether in a rule or symbol action.  Specifies the translation
+    of $ and @.  */
+%x SC_RULE_ACTION SC_SYMBOL_ACTION
+
+
+/* POSIX says that a tag must be both an id and a C union member, but
+   historically almost any character is allowed in a tag.  We disallow
+   NUL and newline, as this simplifies our implementation.  */
+tag	 [^\0\n>]+
+
+/* Zero or more instances of backslash-newline.  Following GCC, allow
+   white space between the backslash and the newline.  */
+splice	 (\\[ \f\t\v]*\n)*
+
+%%
+
+%{
+  /* This scanner is special: it is invoked only once per action and
+     is therefore expected to return only once.  This initialization
+     is thus done once per action to translate. */
+  assert (sc_context == SC_SYMBOL_ACTION
+	  || sc_context == SC_RULE_ACTION
+	  || sc_context == INITIAL);
+  BEGIN sc_context;
+%}
+
+  /*------------------------------------------------------------.
+  | Scanning a C comment.  The initial `/ *' is already eaten.  |
+  `------------------------------------------------------------*/
+
+<SC_COMMENT>
+{
+  "*"{splice}"/"  STRING_GROW; BEGIN sc_context;
+}
+
+
+  /*--------------------------------------------------------------.
+  | Scanning a line comment.  The initial `//' is already eaten.  |
+  `--------------------------------------------------------------*/
+
+<SC_LINE_COMMENT>
+{
+  "\n"		 STRING_GROW; BEGIN sc_context;
+  {splice}	 STRING_GROW;
+}
+
+
+  /*--------------------------------------------.
+  | Scanning user-code characters and strings.  |
+  `--------------------------------------------*/
+
+<SC_CHARACTER,SC_STRING>
+{
+  {splice}|\\{splice}.	STRING_GROW;
+}
+
+<SC_CHARACTER>
+{
+  "'"		STRING_GROW; BEGIN sc_context;
+}
+
+<SC_STRING>
+{
+  "\""		STRING_GROW; BEGIN sc_context;
+}
+
+
+<SC_RULE_ACTION,SC_SYMBOL_ACTION>{
+  "'" {
+    STRING_GROW;
+    BEGIN SC_CHARACTER;
+  }
+  "\"" {
+    STRING_GROW;
+    BEGIN SC_STRING;
+  }
+  "/"{splice}"*" {
+    STRING_GROW;
+    BEGIN SC_COMMENT;
+  }
+  "/"{splice}"/" {
+    STRING_GROW;
+    BEGIN SC_LINE_COMMENT;
+  }
+}
+
+<SC_RULE_ACTION>
+{
+  "$"("<"{tag}">")?(-?[0-9]+|"$")   handle_action_dollar (yytext, *loc);
+  "@"(-?[0-9]+|"$")		    handle_action_at (yytext, *loc);
+
+  "$"  {
+    warn_at (*loc, _("stray `$'"));
+    obstack_sgrow (&obstack_for_string, "$][");
+  }
+  "@"  {
+    warn_at (*loc, _("stray `@'"));
+    obstack_sgrow (&obstack_for_string, "@@");
+  }
+}
+
+<SC_SYMBOL_ACTION>
+{
+  "$$"   obstack_sgrow (&obstack_for_string, "]b4_dollar_dollar[");
+  "@$"   obstack_sgrow (&obstack_for_string, "]b4_at_dollar[");
+}
+
+
+  /*-----------------------------------------.
+  | Escape M4 quoting characters in C code.  |
+  `-----------------------------------------*/
+
+<*>
+{
+  \$	obstack_sgrow (&obstack_for_string, "$][");
+  \@	obstack_sgrow (&obstack_for_string, "@@");
+  \[	obstack_sgrow (&obstack_for_string, "@{");
+  \]	obstack_sgrow (&obstack_for_string, "@}");
+}
+
+  /*-----------------------------------------------------.
+  | By default, grow the string obstack with the input.  |
+  `-----------------------------------------------------*/
+
+<*>.|\n	STRING_GROW;
+
+ /* End of processing. */
+<*><<EOF>>	 {
+                   obstack_1grow (&obstack_for_string, '\0');
+		   return obstack_finish (&obstack_for_string);
+                 }
+
+%%
+
+/* Keeps track of the maximum number of semantic values to the left of
+   a handle (those referenced by $0, $-1, etc.) that are required by
+   the semantic actions of this grammar. */
+int max_left_semantic_context = 0;
+
+
+/*------------------------------------------------------------------.
+| TEXT is pointing to a wannabe semantic value (i.e., a `$').       |
+|                                                                   |
+| Possible inputs: $[<TYPENAME>]($|integer)                         |
+|                                                                   |
+| Output to OBSTACK_FOR_STRING a reference to this semantic value.  |
+`------------------------------------------------------------------*/
+
+static void
+handle_action_dollar (char *text, location loc)
+{
+  const char *type_name = NULL;
+  char *cp = text + 1;
+  int rule_length = symbol_list_length (current_rule->next);
+
+  /* Get the type name if explicit. */
+  if (*cp == '<')
+    {
+      type_name = ++cp;
+      while (*cp != '>')
+	++cp;
+      *cp = '\0';
+      ++cp;
+    }
+
+  if (*cp == '$')
+    {
+      if (!type_name)
+	type_name = symbol_list_n_type_name_get (current_rule, loc, 0);
+      if (!type_name && typed)
+	complain_at (loc, _("$$ of `%s' has no declared type"),
+		     current_rule->sym->tag);
+      if (!type_name)
+	type_name = "";
+      obstack_fgrow1 (&obstack_for_string,
+		      "]b4_lhs_value([%s])[", type_name);
+      current_rule->used = true;
+    }
+  else
+    {
+      long int num;
+      set_errno (0);
+      num = strtol (cp, 0, 10);
+      if (INT_MIN <= num && num <= rule_length && ! get_errno ())
+	{
+	  int n = num;
+	  if (1-n > max_left_semantic_context)
+	    max_left_semantic_context = 1-n;
+	  if (!type_name && n > 0)
+	    type_name = symbol_list_n_type_name_get (current_rule, loc, n);
+	  if (!type_name && typed)
+	    complain_at (loc, _("$%d of `%s' has no declared type"),
+			 n, current_rule->sym->tag);
+	  if (!type_name)
+	    type_name = "";
+	  obstack_fgrow3 (&obstack_for_string,
+			  "]b4_rhs_value(%d, %d, [%s])[",
+			  rule_length, n, type_name);
+	  symbol_list_n_used_set (current_rule, n, true);
+	}
+      else
+	complain_at (loc, _("integer out of range: %s"), quote (text));
+    }
+}
+
+
+/*------------------------------------------------------.
+| TEXT is a location token (i.e., a `@...').  Output to |
+| OBSTACK_FOR_STRING a reference to this location.      |
+`------------------------------------------------------*/
+
+static void
+handle_action_at (char *text, location loc)
+{
+  char *cp = text + 1;
+  int rule_length = symbol_list_length (current_rule->next);
+  locations_flag = true;
+
+  if (*cp == '$')
+    obstack_sgrow (&obstack_for_string, "]b4_lhs_location[");
+  else
+    {
+      long int num;
+      set_errno (0);
+      num = strtol (cp, 0, 10);
+
+      if (INT_MIN <= num && num <= rule_length && ! get_errno ())
+	{
+	  int n = num;
+	  obstack_fgrow2 (&obstack_for_string, "]b4_rhs_location(%d, %d)[",
+			  rule_length, n);
+	}
+      else
+	complain_at (loc, _("integer out of range: %s"), quote (text));
+    }
+}
+
+
+/*-------------------------.
+| Translate user actions.  |
+`-------------------------*/
+
+/* Translate the dollars and ats in \a a, whose location is l.
+   Depending on the \a sc_context (SC_RULE_ACTION, SC_SYMBOL_ACTION,
+   INITIAL), the processing is different.  */
+
+static const char *
+translate_action (int sc_context, const char *a, location l)
+{
+  const char *res;
+  static bool initialized = false;
+  if (!initialized)
+    {
+      obstack_init (&obstack_for_string);
+      /* The initial buffer, never used. */
+      yy_delete_buffer (YY_CURRENT_BUFFER);
+      yy_flex_debug = 0;
+      initialized = true;
+    }
+
+  loc->start = loc->end = l.start;
+  yy_switch_to_buffer (yy_scan_string (a));
+  res = code_lex (sc_context);
+  yy_delete_buffer (YY_CURRENT_BUFFER);
+
+  return res;
+}
+
+const char *
+translate_rule_action (symbol_list *r, const char *a, location l)
+{
+  current_rule = r;
+  return translate_action (SC_RULE_ACTION, a, l);
+}
+
+const char *
+translate_symbol_action (const char *a, location l)
+{
+  return translate_action (SC_SYMBOL_ACTION, a, l);
+}
+
+const char *
+translate_code (const char *a, location l)
+{
+  return translate_action (INITIAL, a, l);
+}
+
+/*-----------------------------------------------.
+| Free all the memory allocated to the scanner.  |
+`-----------------------------------------------*/
+
+void
+code_scanner_free (void)
+{
+  obstack_free (&obstack_for_string, 0);
+  /* Reclaim Flex's buffers.  */
+  yy_delete_buffer (YY_CURRENT_BUFFER);
+}
@@ -0,0 +1,44 @@
+/* Bison Grammar Scanner
+
+   Copyright (C) 2006 Free Software Foundation, Inc.
+
+   This file is part of Bison, the GNU Compiler Compiler.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+   02110-1301  USA
+*/
+
+#ifndef SCAN_GRAM_H_
+# define SCAN_GRAM_H_
+
+/* From the scanner.  */
+extern FILE *gram_in;
+extern int gram__flex_debug;
+extern boundary gram_scanner_cursor;
+extern char *gram_last_string;
+extern location gram_last_braced_code_loc;
+void gram_scanner_initialize (void);
+void gram_scanner_free (void);
+void gram_scanner_last_string_free (void);
+
+/* These are declared by the scanner, but not used.  We put them here
+   to pacify "make syntax-check".  */
+extern FILE *gram_out;
+extern int gram_lineno;
+
+# define GRAM_LEX_DECL int gram_lex (YYSTYPE *val, location *loc)
+GRAM_LEX_DECL;
+
+#endif /* !SCAN_GRAM_H_ */
@@ -29,112 +29,48 @@
 #undef gram_wrap
 #define gram_wrap() 1

-#include "system.h"
-
-#include <mbswidth.h>
-#include <quote.h>
+#define FLEX_PREFIX(Id) gram_ ## Id
+#include "flex-scanner.h"

 #include "complain.h"
 #include "files.h"
-#include "getargs.h"
+#include "getargs.h"    /* yacc_flag */
 #include "gram.h"
 #include "quotearg.h"
 #include "reader.h"
 #include "uniqstr.h"

-#define YY_USER_INIT					\
-  do							\
-    {							\
-      scanner_cursor.file = current_file;		\
-      scanner_cursor.line = 1;				\
-      scanner_cursor.column = 1;			\
-      code_start = scanner_cursor;			\
-    }							\
-  while (0)
+#include <mbswidth.h>
+#include <quote.h>

-/* Pacify "gcc -Wmissing-prototypes" when flex 2.5.31 is used.  */
-int gram_get_lineno (void);
-FILE *gram_get_in (void);
-FILE *gram_get_out (void);
-int gram_get_leng (void);
-char *gram_get_text (void);
-void gram_set_lineno (int);
-void gram_set_in (FILE *);
-void gram_set_out (FILE *);
-int gram_get_debug (void);
-void gram_set_debug (int);
-int gram_lex_destroy (void);
+#include "scan-gram.h"
+
+#define YY_DECL GRAM_LEX_DECL
+
+#define YY_USER_INIT					\
+   code_start = scanner_cursor = loc->start;

 /* Location of scanner cursor.  */
 boundary scanner_cursor;

-static void adjust_location (location *, char const *, size_t);
-#define YY_USER_ACTION  adjust_location (loc, yytext, yyleng);
+#define YY_USER_ACTION  location_compute (loc, &scanner_cursor, yytext, yyleng);

 static size_t no_cr_read (FILE *, char *, size_t);
 #define YY_INPUT(buf, result, size) ((result) = no_cr_read (yyin, buf, size))

-
-/* OBSTACK_FOR_STRING -- Used to store all the characters that we need to
-   keep (to construct ID, STRINGS etc.).  Use the following macros to
-   use it.
-
-   Use STRING_GROW to append what has just been matched, and
-   STRING_FINISH to end the string (it puts the ending 0).
-   STRING_FINISH also stores this string in LAST_STRING, which can be
-   used, and which is used by STRING_FREE to free the last string.  */
-
-static struct obstack obstack_for_string;
-
 /* A string representing the most recently saved token.  */
 char *last_string;

-/* The location of the most recently saved token, if it was a
-   BRACED_CODE token; otherwise, this has an unspecified value.  */
-location last_braced_code_loc;
-
-#define STRING_GROW   \
-  obstack_grow (&obstack_for_string, yytext, yyleng)
-
-#define STRING_FINISH					\
-  do {							\
-    obstack_1grow (&obstack_for_string, '\0');		\
-    last_string = obstack_finish (&obstack_for_string);	\
-  } while (0)
-
-#define STRING_FREE \
-  obstack_free (&obstack_for_string, last_string)
-
 void
-scanner_last_string_free (void)
+gram_scanner_last_string_free (void)
 {
  STRING_FREE;
 }

-/* Within well-formed rules, RULE_LENGTH is the number of values in
-   the current rule so far, which says where to find `$0' with respect
-   to the top of the stack.  It is not the same as the rule->length in
-   the case of mid rule actions.
+/* The location of the most recently saved token, if it was a
+   BRACED_CODE token; otherwise, this has an unspecified value.  */
+location gram_last_braced_code_loc;

-   Outside of well-formed rules, RULE_LENGTH has an undefined value.  */
-static int rule_length;
-
-static void rule_length_overflow (location) __attribute__ ((__noreturn__));
-
-/* Increment the rule length by one, checking for overflow.  */
-static inline void
-increment_rule_length (location loc)
-{
-  rule_length++;
-
-  /* Don't allow rule_length == INT_MAX, since that might cause
-     confusion with strtol if INT_MAX == LONG_MAX.  */
-  if (rule_length == INT_MAX)
-    rule_length_overflow (loc);
-}
-
-static void handle_dollar (int token_type, char *cp, location loc);
-static void handle_at (int token_type, char *cp, location loc);
 static void handle_syncline (char *, location);
 static unsigned long int scan_integer (char const *p, int base, location loc);
 static int convert_ucn_to_byte (char const *hex_text);
@@ -142,11 +78,26 @@ static void unexpected_eof (boundary, char const *);
 static void unexpected_newline (boundary, char const *);

 %}
-%x SC_COMMENT SC_LINE_COMMENT SC_YACC_COMMENT
-%x SC_STRING SC_CHARACTER
-%x SC_AFTER_IDENTIFIER
+ /* A C-like comment in directives/rules. */
+%x SC_YACC_COMMENT
+ /* Strings and characters in directives/rules. */
 %x SC_ESCAPED_STRING SC_ESCAPED_CHARACTER
-%x SC_PRE_CODE SC_BRACED_CODE SC_PROLOGUE SC_EPILOGUE
+ /* An identifier was just read in directives/rules.  Special state
+    to capture the sequence `identifier :'. */
+%x SC_AFTER_IDENTIFIER
+ /* A keyword that should be followed by some code was read (e.g.
+    %printer). */
+%x SC_PRE_CODE
+
+ /* Three types of user code:
+    - prologue (code between `%{' `%}' in the first section, before %%);
+    - actions, printers, union, etc. (between braces in the middle section);
+    - epilogue (everything after the second %%). */
+%x SC_PROLOGUE SC_BRACED_CODE SC_EPILOGUE
+ /* C and C++ comments in code. */
+%x SC_COMMENT SC_LINE_COMMENT
+ /* Strings and characters in code. */
+%x SC_STRING SC_CHARACTER

 letter	  [.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_]
 id	  {letter}({letter}|[0-9])*
@@ -221,17 +172,17 @@ splice	 (\\[ \f\t\v]*\n)*
  "%default"[-_]"prec"    return PERCENT_DEFAULT_PREC;
  "%define"               return PERCENT_DEFINE;
  "%defines"              return PERCENT_DEFINES;
-  "%destructor"		  token_type = PERCENT_DESTRUCTOR; BEGIN SC_PRE_CODE;
+  "%destructor"		  /* FIXME: Remove once %union handled differently.  */ token_type = BRACED_CODE; return PERCENT_DESTRUCTOR;
  "%dprec"		  return PERCENT_DPREC;
  "%error"[-_]"verbose"   return PERCENT_ERROR_VERBOSE;
  "%expect"               return PERCENT_EXPECT;
  "%expect"[-_]"rr"	  return PERCENT_EXPECT_RR;
  "%file-prefix"          return PERCENT_FILE_PREFIX;
  "%fixed"[-_]"output"[-_]"files"   return PERCENT_YACC;
-  "%initial-action"       token_type = PERCENT_INITIAL_ACTION; BEGIN SC_PRE_CODE;
+  "%initial-action"       /* FIXME: Remove once %union handled differently.  */ token_type = BRACED_CODE; return PERCENT_INITIAL_ACTION;
  "%glr-parser"           return PERCENT_GLR_PARSER;
  "%left"                 return PERCENT_LEFT;
-  "%lex-param"		  token_type = PERCENT_LEX_PARAM; BEGIN SC_PRE_CODE;
+  "%lex-param"		  /* FIXME: Remove once %union handled differently.  */ token_type = BRACED_CODE; return PERCENT_LEX_PARAM;
  "%locations"            return PERCENT_LOCATIONS;
  "%merge"		  return PERCENT_MERGE;
  "%name"[-_]"prefix"     return PERCENT_NAME_PREFIX;
@@ -241,9 +192,9 @@ splice	 (\\[ \f\t\v]*\n)*
  "%nondeterministic-parser"   return PERCENT_NONDETERMINISTIC_PARSER;
  "%nterm"                return PERCENT_NTERM;
  "%output"               return PERCENT_OUTPUT;
-  "%parse-param"	  token_type = PERCENT_PARSE_PARAM; BEGIN SC_PRE_CODE;
-  "%prec"                 rule_length--; return PERCENT_PREC;
-  "%printer"              token_type = PERCENT_PRINTER; BEGIN SC_PRE_CODE;
+  "%parse-param"	  /* FIXME: Remove once %union handled differently.  */ token_type = BRACED_CODE; return PERCENT_PARSE_PARAM;
+  "%prec"                 return PERCENT_PREC;
+  "%printer"              /* FIXME: Remove once %union handled differently.  */ token_type = BRACED_CODE; return PERCENT_PRINTER;
  "%pure"[-_]"parser"     return PERCENT_PURE_PARSER;
  "%require"              return PERCENT_REQUIRE;
  "%right"                return PERCENT_RIGHT;
@@ -262,13 +213,12 @@ splice	 (\\[ \f\t\v]*\n)*
  }

  "="                     return EQUAL;
-  "|"                     rule_length = 0; return PIPE;
+  "|"                     return PIPE;
  ";"                     return SEMICOLON;

  {id} {
    val->symbol = symbol_get (yytext, *loc);
    id_loc = *loc;
-    increment_rule_length (*loc);
    BEGIN SC_AFTER_IDENTIFIER;
  }

@@ -335,7 +285,6 @@ splice	 (\\[ \f\t\v]*\n)*
 <SC_AFTER_IDENTIFIER>
 {
  ":" {
-    rule_length = 0;
    *loc = id_loc;
    BEGIN INITIAL;
    return ID_COLON;
@@ -401,7 +350,6 @@ splice	 (\\[ \f\t\v]*\n)*
    STRING_FINISH;
    loc->start = token_start;
    val->chars = last_string;
-    increment_rule_length (*loc);
    BEGIN INITIAL;
    return STRING;
  }
@@ -428,7 +376,6 @@ splice	 (\\[ \f\t\v]*\n)*
    last_string_1 = last_string[1];
    symbol_user_token_number_set (val->symbol, last_string_1, *loc);
    STRING_FREE;
-    increment_rule_length (*loc);
    BEGIN INITIAL;
    return ID;
  }
@@ -501,7 +448,7 @@ splice	 (\\[ \f\t\v]*\n)*

 <SC_CHARACTER,SC_STRING>
 {
-  {splice}|\\{splice}[^\n$@\[\]]	STRING_GROW;
+  {splice}|\\{splice}[^\n\[\]]	STRING_GROW;
 }

 <SC_CHARACTER>
@@ -622,8 +569,7 @@ splice	 (\\[ \f\t\v]*\n)*
 	STRING_FINISH;
 	loc->start = code_start;
 	val->chars = last_string;
-	increment_rule_length (*loc);
-	last_braced_code_loc = *loc;
+	gram_last_braced_code_loc = *loc;
 	BEGIN INITIAL;
 	return token_type;
      }
@@ -633,18 +579,6 @@ splice	 (\\[ \f\t\v]*\n)*
     (as `<' `<%').  */
  "<"{splice}"<"  STRING_GROW;

-  "$"("<"{tag}">")?(-?[0-9]+|"$")  handle_dollar (token_type, yytext, *loc);
-  "@"(-?[0-9]+|"$")		   handle_at (token_type, yytext, *loc);
-
-  "$"  {
-    warn_at (*loc, _("stray `$'"));
-    obstack_sgrow (&obstack_for_string, "$][");
-  }
-  "@"  {
-    warn_at (*loc, _("stray `@'"));
-    obstack_sgrow (&obstack_for_string, "@@");
-  }
-
  <<EOF>>  unexpected_eof (code_start, "}"); BEGIN INITIAL;
 }

@@ -684,19 +618,6 @@ splice	 (\\[ \f\t\v]*\n)*
 }


-  /*-----------------------------------------.
-  | Escape M4 quoting characters in C code.  |
-  `-----------------------------------------*/
-
-<SC_COMMENT,SC_LINE_COMMENT,SC_STRING,SC_CHARACTER,SC_BRACED_CODE,SC_PROLOGUE,SC_EPILOGUE>
-{
-  \$	obstack_sgrow (&obstack_for_string, "$][");
-  \@	obstack_sgrow (&obstack_for_string, "@@");
-  \[	obstack_sgrow (&obstack_for_string, "@{");
-  \]	obstack_sgrow (&obstack_for_string, "@}");
-}
-
-
  /*-----------------------------------------------------.
  | By default, grow the string obstack with the input.  |
  `-----------------------------------------------------*/
@@ -706,79 +627,6 @@ splice	 (\\[ \f\t\v]*\n)*

 %%

-/* Keeps track of the maximum number of semantic values to the left of
-   a handle (those referenced by $0, $-1, etc.) are required by the
-   semantic actions of this grammar. */
-int max_left_semantic_context = 0;
-
-/* If BUF is null, add BUFSIZE (which in this case must be less than
-   INT_MAX) to COLUMN; otherwise, add mbsnwidth (BUF, BUFSIZE, 0) to
-   COLUMN.  If an overflow occurs, or might occur but is undetectable,
-   return INT_MAX.  Assume COLUMN is nonnegative.  */
-
-static inline int
-add_column_width (int column, char const *buf, size_t bufsize)
-{
-  size_t width;
-  unsigned int remaining_columns = INT_MAX - column;
-
-  if (buf)
-    {
-      if (INT_MAX / 2 <= bufsize)
-	return INT_MAX;
-      width = mbsnwidth (buf, bufsize, 0);
-    }
-  else
-    width = bufsize;
-
-  return width <= remaining_columns ? column + width : INT_MAX;
-}
-
-/* Set *LOC and adjust scanner cursor to account for token TOKEN of
-   size SIZE.  */
-
-static void
-adjust_location (location *loc, char const *token, size_t size)
-{
-  int line = scanner_cursor.line;
-  int column = scanner_cursor.column;
-  char const *p0 = token;
-  char const *p = token;
-  char const *lim = token + size;
-
-  loc->start = scanner_cursor;
-
-  for (p = token; p < lim; p++)
-    switch (*p)
-      {
-      case '\n':
-	line += line < INT_MAX;
-	column = 1;
-	p0 = p + 1;
-	break;
-
-      case '\t':
-	column = add_column_width (column, p0, p - p0);
-	column = add_column_width (column, NULL, 8 - ((column - 1) & 7));
-	p0 = p + 1;
-	break;
-
-      default:
-	break;
-      }
-
-  scanner_cursor.line = line;
-  scanner_cursor.column = column = add_column_width (column, p0, p - p0);
-
-  loc->end = scanner_cursor;
-
-  if (line == INT_MAX && loc->start.line != INT_MAX)
-    warn_at (*loc, _("line number overflow"));
-  if (column == INT_MAX && loc->start.column != INT_MAX)
-    warn_at (*loc, _("column number overflow"));
-}
-
-
 /* Read bytes from FP into buffer BUF of size SIZE.  Return the
   number of bytes read.  Remove '\r' from input, treating \r\n
   and isolated \r as \n.  */
@@ -826,173 +674,6 @@ no_cr_read (FILE *fp, char *buf, size_t size)
 }


-/*------------------------------------------------------------------.
-| TEXT is pointing to a wannabee semantic value (i.e., a `$').      |
-|                                                                   |
-| Possible inputs: $[<TYPENAME>]($|integer)                         |
-|                                                                   |
-| Output to OBSTACK_FOR_STRING a reference to this semantic value.  |
-`------------------------------------------------------------------*/
-
-static inline bool
-handle_action_dollar (char *text, location loc)
-{
-  const char *type_name = NULL;
-  char *cp = text + 1;
-
-  if (! current_rule)
-    return false;
-
-  /* Get the type name if explicit. */
-  if (*cp == '<')
-    {
-      type_name = ++cp;
-      while (*cp != '>')
-	++cp;
-      *cp = '\0';
-      ++cp;
-    }
-
-  if (*cp == '$')
-    {
-      if (!type_name)
-	type_name = symbol_list_n_type_name_get (current_rule, loc, 0);
-      if (!type_name && typed)
-	complain_at (loc, _("$$ of `%s' has no declared type"),
-		     current_rule->sym->tag);
-      if (!type_name)
-	type_name = "";
-      obstack_fgrow1 (&obstack_for_string,
-		      "]b4_lhs_value([%s])[", type_name);
-      current_rule->used = true;
-    }
-  else
-    {
-      long int num = strtol (cp, NULL, 10);
-
-      if (1 - INT_MAX + rule_length <= num && num <= rule_length)
-	{
-	  int n = num;
-	  if (max_left_semantic_context < 1 - n)
-	    max_left_semantic_context = 1 - n;
-	  if (!type_name && 0 < n)
-	    type_name = symbol_list_n_type_name_get (current_rule, loc, n);
-	  if (!type_name && typed)
-	    complain_at (loc, _("$%d of `%s' has no declared type"),
-			 n, current_rule->sym->tag);
-	  if (!type_name)
-	    type_name = "";
-	  obstack_fgrow3 (&obstack_for_string,
-			  "]b4_rhs_value(%d, %d, [%s])[",
-			  rule_length, n, type_name);
-	  symbol_list_n_used_set (current_rule, n, true);
-	}
-      else
-	complain_at (loc, _("integer out of range: %s"), quote (text));
-    }
-
-  return true;
-}
-
-
-/*----------------------------------------------------------------.
-| Map `$?' onto the proper M4 symbol, depending on its TOKEN_TYPE |
-| (are we in an action?).                                         |
-`----------------------------------------------------------------*/
-
-static void
-handle_dollar (int token_type, char *text, location loc)
-{
-  switch (token_type)
-    {
-    case BRACED_CODE:
-      if (handle_action_dollar (text, loc))
-	return;
-      break;
-
-    case PERCENT_DESTRUCTOR:
-    case PERCENT_INITIAL_ACTION:
-    case PERCENT_PRINTER:
-      if (text[1] == '$')
-	{
-	  obstack_sgrow (&obstack_for_string, "]b4_dollar_dollar[");
-	  return;
-	}
-      break;
-
-    default:
-      break;
-    }
-
-  complain_at (loc, _("invalid value: %s"), quote (text));
-}
-
-
-/*------------------------------------------------------.
-| TEXT is a location token (i.e., a `@...').  Output to |
-| OBSTACK_FOR_STRING a reference to this location.      |
-`------------------------------------------------------*/
-
-static inline bool
-handle_action_at (char *text, location loc)
-{
-  char *cp = text + 1;
-  locations_flag = true;
-
-  if (! current_rule)
-    return false;
-
-  if (*cp == '$')
-    obstack_sgrow (&obstack_for_string, "]b4_lhs_location[");
-  else
-    {
-      long int num = strtol (cp, NULL, 10);
-
-      if (1 - INT_MAX + rule_length <= num && num <= rule_length)
-	{
-	  int n = num;
-	  obstack_fgrow2 (&obstack_for_string, "]b4_rhs_location(%d, %d)[",
-			  rule_length, n);
-	}
-      else
-	complain_at (loc, _("integer out of range: %s"), quote (text));
-    }
-
-  return true;
-}
-
-
-/*----------------------------------------------------------------.
-| Map `@?' onto the proper M4 symbol, depending on its TOKEN_TYPE |
-| (are we in an action?).                                         |
-`----------------------------------------------------------------*/
-
-static void
-handle_at (int token_type, char *text, location loc)
-{
-  switch (token_type)
-    {
-    case BRACED_CODE:
-      handle_action_at (text, loc);
-      return;
-
-    case PERCENT_INITIAL_ACTION:
-    case PERCENT_DESTRUCTOR:
-    case PERCENT_PRINTER:
-      if (text[1] == '$')
-	{
-	  obstack_sgrow (&obstack_for_string, "]b4_at_dollar[");
-	  return;
-	}
-      break;
-
-    default:
-      break;
-    }
-
-  complain_at (loc, _("invalid value: %s"), quote (text));
-}
-

 /*------------------------------------------------------.
 | Scan NUMBER for a base-BASE integer at location LOC.  |
@@ -1087,20 +768,8 @@ handle_syncline (char *args, location loc)
      warn_at (loc, _("line number overflow"));
      lineno = INT_MAX;
    }
-  scanner_cursor.file = current_file = uniqstr_new (file);
-  scanner_cursor.line = lineno;
-  scanner_cursor.column = 1;
-}
-
-
-/*---------------------------------.
-| Report a rule that is too long.  |
-`---------------------------------*/
-
-static void
-rule_length_overflow (location loc)
-{
-  fatal_at (loc, _("rule is too long"));
+  current_file = uniqstr_new (file);
+  boundary_set (&scanner_cursor, current_file, lineno, 1);
 }


@@ -1148,7 +817,7 @@ unexpected_newline (boundary start, char const *token_end)
 `-------------------------*/

 void
-scanner_initialize (void)
+gram_scanner_initialize (void)
 {
  obstack_init (&obstack_for_string);
 }
@@ -1159,7 +828,7 @@ scanner_initialize (void)
 `-----------------------------------------------*/

 void
-scanner_free (void)
+gram_scanner_free (void)
 {
  obstack_free (&obstack_for_string, 0);
  /* Reclaim Flex's buffers.  */
@@ -113,6 +113,8 @@ char *base_name (char const *name);
 # define ATTRIBUTE_UNUSED __attribute__ ((__unused__))
 #endif

+#define FUNCTION_PRINT() fprintf (stderr, "%s: ", __func__)
+
 /*------.
 | NLS.  |
 `------*/
@@ -25,33 +25,17 @@ AT_BANNER([[Input Processing.]])
 ## Invalid $n.  ##
 ## ------------ ##

-AT_SETUP([Invalid dollar-n])
+AT_SETUP([Invalid \$n and @n])

 AT_DATA([input.y],
 [[%%
 exp: { $$ = $1 ; };
-]])
-
-AT_CHECK([bison input.y], [1], [],
-[[input.y:2.13-14: integer out of range: `$1'
-]])
-
-AT_CLEANUP
-
-
-## ------------ ##
-## Invalid @n.  ##
-## ------------ ##
-
-AT_SETUP([Invalid @n])
-
-AT_DATA([input.y],
-[[%%
 exp: { @$ = @1 ; };
 ]])

 AT_CHECK([bison input.y], [1], [],
-[[input.y:2.13-14: integer out of range: `@1'
+[[input.y:2.13-14: integer out of range: `$1'
+input.y:3.13-14: integer out of range: `@1'
 ]])

 AT_CLEANUP
@@ -200,11 +184,11 @@ AT_SETUP([Torturing the Scanner])

 AT_DATA([input.y], [])
 AT_CHECK([bison input.y], [1], [],
-[[input.y:1.1: syntax error, unexpected end of file
+[[input.y:1.0: syntax error, unexpected end of file
 ]])


-AT_DATA([input.y], 
+AT_DATA([input.y],
 [{}
 ])
 AT_CHECK([bison input.y], [1], [],
@@ -346,9 +346,7 @@ AT_DATA([input.y],
 ]])

 AT_CHECK([bison input.y], [1], [],
-[[input.y:3.1: missing `{' in "%destructor {...}"
-input.y:4.1: missing `{' in "%initial-action {...}"
-input.y:4.1: syntax error, unexpected %initial-action {...}, expecting string or identifier
+[[input.y:3.1-15: syntax error, unexpected %initial-action, expecting {...}
 ]])

 AT_CLEANUP