parse.lac: implement as %define variable.

LAC = lookahead correction.  See discussion at
<http://lists.gnu.org/archive/html/bison-patches/2009-09/msg00034.html>.
However, one point there must be corrected: because of %nonassoc,
LAC is *not* always redundant for lr.type=canonical-lr.
* data/yacc.c: Accept values of "none" (default) or "full" for
parse.lac.  Accept %define parse.lac.es-capacity to specify
capacity of LAC's temporary exploratory stack.  It defaults to 20
and, for now, will not grow dynamically.
(b4_lac_flag, b4_lac_if): New m4 macros.  Evaluate as true for
parse.lac!=none.
(YYBACKUP): Invoke YY_LAC_DISCARD.
(YY_LAC_ESTABLISH, YY_LAC_DISCARD): New cpp macros that invoke
yy_lac and track when it needs to be invoked.
(yy_lac): New function that, given the current stack, determines
whether a token can eventually be shifted.  Return status mimics
yyparse return status.
(yysyntax_error): Change yystate argument to yyssp so stack top
can be passed to yy_lac.  If LAC is requested, build expected
token list by invoking yy_lac for every token instead of just
checking the current state for lookaheads.  Return 2 if yy_lac
exhausts memory.
(yyparse, yypush_parse): Use local variable yy_lac_established and
cpp macros YY_LAC_ESTABLISH and YY_LAC_DISCARD to implement LAC.
Update yysyntax_error invocation.  Add yyexhaustedlab code if LAC
is requested.
* tests/conflicts.at (%nonassoc and eof): Extend to check the
effect of each of -Dlr.type=canonical-lr and -Dparse.lac=full.
(%error-verbose and consistent errors): Likewise.
(LAC: %nonassoc requires splitting canonical LR states): New test
group demonstrating how LAC can fix canonical LR.
* tests/input.at (LAC: Errors for %define): New test group.
* tests/regression.at (LAC: Exploratory stack): New test group.
(LAC: Memory exhaustion): New test group.
(cherry picked from commit bf35c71c58)

Conflicts:

	src/parse-gram.c
	src/parse-gram.h
Joel E. Denny
2010-12-11 11:13:33 -05:00
parent dcd39f1d4a
commit ea13bea8ab
7 changed files with 795 additions and 197 deletions
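
The establish/discard protocol described in the log above can be pictured with a small standalone C sketch. This is not the skeleton's code: `toy_lac`, `lac_establish`, `lac_discard`, and `lac_calls` are hypothetical names, and the stub check accepts every token. It only illustrates that the exploratory check runs at most once per lookahead until a shift (or another lookahead change) discards the established context.

```c
/* Hypothetical stand-ins; in the skeleton, yy_lac_established is a
   local variable of yyparse/yypush_parse.  */
int lac_established = 0;
int lac_calls = 0;          /* counts how often the stub check runs */

/* Stand-in for yy_lac.  Mirrors its status convention:
   0 = token can eventually be shifted, 1 = syntax error,
   2 = memory exhausted.  This stub accepts every token.  */
int
toy_lac (int token)
{
  (void) token;
  ++lac_calls;
  return 0;
}

/* Like YY_LAC_ESTABLISH: run the check only if no initial context
   is currently established for the lookahead.  */
int
lac_establish (int token)
{
  if (lac_established)
    return 0;
  lac_established = 1;
  return toy_lac (token);
}

/* Like YY_LAC_DISCARD: forget the context, e.g. after a shift.  */
void
lac_discard (void)
{
  lac_established = 0;
}
```

Calling `lac_establish` several times for the same lookahead (one call per reduction, say) invokes the stub once; after `lac_discard`, the next call invokes it again.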

@@ -1,3 +1,40 @@
2010-12-11 Joel E. Denny <jdenny@clemson.edu>
parse.lac: implement as %define variable.
LAC = lookahead correction. See discussion at
<http://lists.gnu.org/archive/html/bison-patches/2009-09/msg00034.html>.
However, one point there must be corrected: because of %nonassoc,
LAC is *not* always redundant for lr.type=canonical-lr.
* data/yacc.c: Accept values of "none" (default) or "full" for
parse.lac. Accept %define parse.lac.es-capacity to specify
capacity of LAC's temporary exploratory stack. It defaults to 20
and, for now, will not grow dynamically.
(b4_lac_flag, b4_lac_if): New m4 macros. Evaluate as true for
parse.lac!=none.
(YYBACKUP): Invoke YY_LAC_DISCARD.
(YY_LAC_ESTABLISH, YY_LAC_DISCARD): New cpp macros that invoke
yy_lac and track when it needs to be invoked.
(yy_lac): New function that, given the current stack, determines
whether a token can eventually be shifted. Return status mimics
yyparse return status.
(yysyntax_error): Change yystate argument to yyssp so stack top
can be passed to yy_lac. If LAC is requested, build expected
token list by invoking yy_lac for every token instead of just
checking the current state for lookaheads. Return 2 if yy_lac
exhausts memory.
(yyparse, yypush_parse): Use local variable yy_lac_established and
cpp macros YY_LAC_ESTABLISH and YY_LAC_DISCARD to implement LAC.
Update yysyntax_error invocation. Add yyexhaustedlab code if LAC
is requested.
* tests/conflicts.at (%nonassoc and eof): Extend to check the
effect of each of -Dlr.type=canonical-lr and -Dparse.lac=full.
(%error-verbose and consistent errors): Likewise.
(LAC: %nonassoc requires splitting canonical LR states): New test
group demonstrating how LAC can fix canonical LR.
* tests/input.at (LAC: Errors for %define): New test group.
* tests/regression.at (LAC: Exploratory stack): New test group.
(LAC: Memory exhaustion): New test group.
2010-11-21 Joel E. Denny <jdenny@clemson.edu>
build: use gnulib's new bootstrap_sync option.

@@ -36,6 +36,16 @@ b4_use_push_for_pull_if([
b4_push_if([m4_define([b4_use_push_for_pull_flag], [[0]])],
[m4_define([b4_push_flag], [[1]])])])
# Check the value of %define parse.lac, where LAC stands for lookahead
# correction.
b4_percent_define_default([[parse.lac]], [[none]])
b4_percent_define_default([[parse.lac.es-capacity]], [[20]])
b4_percent_define_check_values([[[[parse.lac]], [[full]], [[none]]]])
b4_define_flag_if([lac])
m4_define([b4_lac_flag],
[m4_if(b4_percent_define_get([[parse.lac]]),
[none], [[0]], [[1]])])
m4_include(b4_pkgdatadir/[c.m4])
## ---------------- ##
@@ -697,7 +707,8 @@ do \
{ \
yychar = (Token); \
yylval = (Value); \
YYPOPSTACK (1); \
YYPOPSTACK (1); \]b4_lac_if([[
YY_LAC_DISCARD ("YYBACKUP"); \]])[
goto yybackup; \
} \
else \
@@ -879,9 +890,173 @@ int yydebug;
#ifndef YYMAXDEPTH
# define YYMAXDEPTH ]b4_stack_depth_max[
#endif]b4_lac_if([[
/* Establish the initial context for the current lookahead if no initial
context is currently established.
We define a context as a snapshot of the parser stacks. We define
the initial context for a lookahead as the context in which the
parser initially examines that lookahead in order to select a
syntactic action. Thus, if the lookahead eventually proves
syntactically unacceptable (possibly in a later context reached via a
series of reductions), the initial context can be used to determine
the exact set of tokens that would be syntactically acceptable in the
lookahead's place. Moreover, it is the context after which any
further semantic actions would be erroneous because they would be
determined by a syntactically unacceptable token.
YY_LAC_ESTABLISH should be invoked when a reduction is about to be
performed in an inconsistent state (which, for the purposes of LAC,
includes consistent states that don't know they're consistent because
their default reductions have been disabled). Iff there is a
lookahead token, it should also be invoked before reporting a syntax
error. This latter case is for the sake of the debugging output.
For parse.lac=full, the implementation of YY_LAC_ESTABLISH is as
follows. If no initial context is currently established for the
current lookahead, then check if that lookahead can eventually be
shifted if syntactic actions continue from the current context.
Report a syntax error if it cannot. */
#define YY_LAC_ESTABLISH \
do { \
if (!yy_lac_established) \
{ \
YYDPRINTF ((stderr, \
"LAC: initial context established for %s\n", \
yytname[yytoken])); \
yy_lac_established = 1; \
{ \
int yy_lac_status = \
yy_lac (yyssp, yytoken); \
if (yy_lac_status == 2) \
goto yyexhaustedlab; \
if (yy_lac_status == 1) \
goto yyerrlab; \
} \
} \
} while (YYID (0))
/* Discard any previous initial lookahead context because of Event,
which may be a lookahead change or an invalidation of the currently
established initial context for the current lookahead.
The most common example of a lookahead change is a shift. An example
of both cases is syntax error recovery. That is, a syntax error
occurs when the lookahead is syntactically erroneous for the
currently established initial context, so error recovery manipulates
the parser stacks to try to find a new initial context in which the
current lookahead is syntactically acceptable. If it fails to find
such a context, it discards the lookahead. */
#if YYDEBUG
# define YY_LAC_DISCARD(Event) \
do { \
if (yy_lac_established) \
{ \
if (yydebug) \
YYFPRINTF (stderr, "LAC: initial context discarded due to " \
Event "\n"); \
yy_lac_established = 0; \
} \
} while (YYID (0))
#else
# define YY_LAC_DISCARD(Event) yy_lac_established = 0
#endif
/* Given the stack whose top is *YYSSP, return 0 iff YYTOKEN can
eventually (after perhaps some reductions) be shifted, and return 1
if not. Return 2 if memory is exhausted. */
static int
yy_lac (yytype_int16 *yyssp, int yytoken)
{
yytype_int16 *yyes_prev = yyssp;
yytype_int16 yyes@{]b4_percent_define_get([[parse.lac.es-capacity]])[@};
yytype_int16 *yyesp = yyes_prev;
YYDPRINTF ((stderr, "LAC: checking lookahead %s:", yytname[yytoken]));
if (yytoken == YYUNDEFTOK)
{
YYDPRINTF ((stderr, " Always Err\n"));
return 1;
}
while (1)
{
int yyrule = yypact[*yyesp];
if (yypact_value_is_default (yyrule)
|| (yyrule += yytoken) < 0 || YYLAST < yyrule
|| yycheck[yyrule] != yytoken)
{
yyrule = yydefact[*yyesp];
if (yyrule == 0)
{
YYDPRINTF ((stderr, " Err\n"));
return 1;
}
}
else
{
yyrule = yytable[yyrule];
if (yytable_value_is_error (yyrule))
{
YYDPRINTF ((stderr, " Err\n"));
return 1;
}
if (0 < yyrule)
{
YYDPRINTF ((stderr, " S%d\n", yyrule));
return 0;
}
yyrule = -yyrule;
}
{
YYSIZE_T yylen = yyr2[yyrule];
YYDPRINTF ((stderr, " R%d", yyrule - 1));
if (yyesp != yyes_prev)
{
YYSIZE_T yysize = yyesp - yyes + 1;
if (yylen < yysize)
{
yyesp -= yylen;
yylen = 0;
}
else
{
yylen -= yysize;
yyesp = yyes_prev;
}
}
if (yylen)
yyesp = yyes_prev -= yylen;
}
{
int yystate;
{
int yylhs = yyr1[yyrule] - YYNTOKENS;
yystate = yypgoto[yylhs] + *yyesp;
if (yystate < 0 || YYLAST < yystate
|| yycheck[yystate] != *yyesp)
yystate = yydefgoto[yylhs];
else
yystate = yytable[yystate];
}
if (yyesp == yyes_prev)
{
yyesp = yyes;
*yyesp = yystate;
}
else
{
if (yyesp == yyes + (sizeof yyes / sizeof *yyes) - 1)
{
YYDPRINTF ((stderr, " (max stack size exceeded)\n"));
return 2;
}
*++yyesp = yystate;
}
YYDPRINTF ((stderr, " G%d", *yyesp));
}
}
}]])[
#if YYERROR_VERBOSE
@@ -970,15 +1145,18 @@ yytnamerr (char *yyres, const char *yystr)
# endif
/* Copy into *YYMSG, which is of size *YYMSG_ALLOC, an error message
about the unexpected token YYTOKEN while in state YYSTATE.
about the unexpected token YYTOKEN for the state stack whose top is
YYSSP.]b4_lac_if([[ In order to see if a particular token T is a
valid lookahead, invoke yy_lac (YYSSP, T).]])[
Return 0 if *YYMSG was successfully written. Return 1 if *YYMSG is
not large enough to hold the message. In that case, also set
*YYMSG_ALLOC to the required number of bytes. Return 2 if the
required number of bytes is too large to store. */
required number of bytes is too large to store]b4_lac_if([[ or if
yy_lac returned 2]])[. */
static int
yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg,
int yystate, int yytoken)
yytype_int16 *yyssp, int yytoken)
{
YYSIZE_T yysize0 = yytnamerr (0, yytname[yytoken]);
YYSIZE_T yysize = yysize0;
@@ -1009,7 +1187,12 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg,
- Don't assume there isn't a lookahead just because this state is a
consistent state with a default action. There might have been a
previous inconsistent state, consistent state with a non-default
action, or user semantic action that manipulated yychar.
action, or user semantic action that manipulated yychar.]b4_lac_if([[
In the first two cases, it might appear that the current syntax
error should have been detected in the previous state when yy_lac
was invoked. However, at that time, there might have been a
different syntax error that discarded a different initial context
during error recovery, leaving behind the current lookahead.]], [[
- Of course, the expected token list depends on states to have
correct lookahead information, and it depends on the parser not
to perform extra reductions after fetching a lookahead from the
@@ -1017,26 +1200,39 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg,
(from LALR or IELR) and default reductions corrupt the expected
token list. However, the list is correct for canonical LR with
one exception: it will still contain any token that will not be
accepted due to an error action in a later state.
accepted due to an error action in a later state.]])[
*/
if (yytoken != YYEMPTY)
{
int yyn = yypact[yystate];
int yyn = yypact[*yyssp];]b4_lac_if([[
YYDPRINTF ((stderr, "Constructing syntax error message\n"));]])[
yyarg[yycount++] = yytname[yytoken];
if (!yypact_value_is_default (yyn))
{
{]b4_lac_if([], [[
/* Start YYX at -YYN if negative to avoid negative indexes in
YYCHECK. In other words, skip the first -YYN actions for
this state because they are default actions. */
int yyxbegin = yyn < 0 ? -yyn : 0;
/* Stay within bounds of both yycheck and yytname. */
int yychecklim = YYLAST - yyn + 1;
int yyxend = yychecklim < YYNTOKENS ? yychecklim : YYNTOKENS;
int yyx;
int yyxend = yychecklim < YYNTOKENS ? yychecklim : YYNTOKENS;]])[
int yyx;]b4_lac_if([[
for (yyx = 0; yyx < YYNTOKENS; ++yyx)
if (yyx != YYTERROR && yyx != YYUNDEFTOK)
{
{
int yy_lac_status = yy_lac (yyssp, yyx);
if (yy_lac_status == 2)
return 2;
if (yy_lac_status == 1)
continue;
}]], [[
for (yyx = yyxbegin; yyx < yyxend; ++yyx)
if (yycheck[yyx + yyn] == yyx && yyx != YYTERROR
&& !yytable_value_is_error (yytable[yyx + yyn]))
{
{]])[
if (yycount == YYERROR_VERBOSE_ARGS_MAXIMUM)
{
yycount = 1;
@@ -1050,12 +1246,16 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg,
return 2;
yysize = yysize1;
}
}
}]b4_lac_if([[
# if YYDEBUG
else if (yydebug)
YYFPRINTF (stderr, "No expected tokens.\n");
# endif]])[
}
switch (yycount)
{
#define YYCASE_(N, S) \
# define YYCASE_(N, S) \
case N: \
yyformat = S; \
break
@@ -1065,7 +1265,7 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg,
YYCASE_(3, YY_("syntax error, unexpected %s, expecting %s or %s"));
YYCASE_(4, YY_("syntax error, unexpected %s, expecting %s or %s or %s"));
YYCASE_(5, YY_("syntax error, unexpected %s, expecting %s or %s or %s or %s"));
#undef YYCASE_
# undef YYCASE_
}
yysize1 = yysize + yystrlen (yyformat);
@@ -1103,7 +1303,6 @@ yysyntax_error (YYSIZE_T *yymsg_alloc, char **yymsg,
return 0;
}
#endif /* YYERROR_VERBOSE */
]b4_yydestruct_generate([b4_c_function_def])b4_push_if([], [[
@@ -1238,7 +1437,8 @@ b4_c_function_def([[yyparse]], [[int]], b4_parse_param)[
YYLTYPE yypushed_loc = yylloc;]])
])],
[b4_declare_parser_state_variables
])[
])b4_lac_if([[
int yy_lac_established = 0;]])[
int yyn;
int yyresult;
/* Lookahead token as an internal (translated) token number. */
@@ -1445,13 +1645,18 @@ yyread_pushed_token:]])[
/* If the proper action on seeing token YYTOKEN is to reduce or to
detect an error, take that action. */
yyn += yytoken;
if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken)
goto yydefault;
if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken)]b4_lac_if([[
{
YY_LAC_ESTABLISH;
goto yydefault;
}]], [[
goto yydefault;]])[
yyn = yytable[yyn];
if (yyn <= 0)
{
if (yytable_value_is_error (yyn))
goto yyerrlab;
goto yyerrlab;]b4_lac_if([[
YY_LAC_ESTABLISH;]])[
yyn = -yyn;
goto yyreduce;
}
@@ -1465,7 +1670,8 @@ yyread_pushed_token:]])[
YY_SYMBOL_PRINT ("Shifting", yytoken, &yylval, &yylloc);
/* Discard the shifted token. */
yychar = YYEMPTY;
yychar = YYEMPTY;]b4_lac_if([[
YY_LAC_DISCARD ("shift");]])[
yystate = yyn;
*++yyvsp = yylval;
@@ -1503,12 +1709,22 @@ yyreduce:
]b4_locations_if(
[[ /* Default location. */
YYLLOC_DEFAULT (yyloc, (yylsp - yylen), yylen);]])[
YY_REDUCE_PRINT (yyn);
YY_REDUCE_PRINT (yyn);]b4_lac_if([[
{
int yychar_backup = yychar;
switch (yyn)
{
]b4_user_actions[
default: break;
}
if (yychar_backup != yychar)
YY_LAC_DISCARD ("yychar change");
}]], [[
switch (yyn)
{
]b4_user_actions[
default: break;
}
}]])[
/* User semantic actions sometimes alter yychar, and that requires
that yytoken be updated with the new translation. We take the
approach of translating immediately before every use of yytoken.
@@ -1559,11 +1775,14 @@ yyerrlab:
#if ! YYERROR_VERBOSE
yyerror (]b4_yyerror_args[YY_("syntax error"));
#else
# define YYSYNTAX_ERROR yysyntax_error (&yymsg_alloc, &yymsg, yystate, \
# define YYSYNTAX_ERROR yysyntax_error (&yymsg_alloc, &yymsg, yyssp, \
yytoken)
{
char const *yymsgp = YY_("syntax error");
int yysyntax_error_status = YYSYNTAX_ERROR;
int yysyntax_error_status;]b4_lac_if([[
if (yychar != YYEMPTY)
YY_LAC_ESTABLISH;]])[
yysyntax_error_status = YYSYNTAX_ERROR;
if (yysyntax_error_status == 0)
yymsgp = yymsg;
else if (yysyntax_error_status == 1)
@@ -1668,7 +1887,11 @@ yyerrlab1:
YYPOPSTACK (1);
yystate = *yyssp;
YY_STACK_PRINT (yyss, yyssp);
}
}]b4_lac_if([[
/* If the stack popping above didn't lose the initial context for the
current lookahead token, the shift below will for sure. */
YY_LAC_DISCARD ("error recovery");]])[
*++yyvsp = yylval;
]b4_locations_if([[
@@ -1699,7 +1922,7 @@ yyabortlab:
yyresult = 1;
goto yyreturn;
#if !defined(yyoverflow) || YYERROR_VERBOSE
#if ]b4_lac_if([[1]], [[!defined(yyoverflow) || YYERROR_VERBOSE]])[
/*-------------------------------------------------.
| yyexhaustedlab -- memory exhaustion comes here. |
`-------------------------------------------------*/
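
The way yy_lac in the hunks above explores reductions without modifying the real parser stack can be sketched in isolation. The following is a simplified illustration under stated assumptions, not the skeleton's implementation: the `explore` struct, its functions, and `ES_CAPACITY` are hypothetical, and a length counter replaces the skeleton's pointer comparison between yyesp and yyes_prev. Pops consume the exploratory stack first and then only move a read pointer down the real stack; pushes go to the fixed-size exploratory stack and report status 2 when it is full.

```c
#include <stddef.h>

/* Illustrative capacity; the skeleton takes it from
   %define parse.lac.es-capacity.  */
enum { ES_CAPACITY = 8 };

typedef struct
{
  const short *base_top;  /* top of the real-stack prefix still in use */
  short es[ES_CAPACITY];  /* states pushed during exploration only */
  size_t es_len;          /* number of states currently on es */
} explore;

void
explore_init (explore *e, const short *stack_top)
{
  e->base_top = stack_top;
  e->es_len = 0;
}

/* Current state: top of es if non-empty, else top of the real prefix.  */
short
explore_top (const explore *e)
{
  return e->es_len ? e->es[e->es_len - 1] : *e->base_top;
}

/* Pop N states for a reduction.  The real stack is only read: popping
   past the exploratory stack just lowers base_top.  The caller must not
   pop more states than the two stacks hold together.  */
void
explore_pop (explore *e, size_t n)
{
  if (n <= e->es_len)
    {
      e->es_len -= n;
      return;
    }
  n -= e->es_len;
  e->es_len = 0;
  e->base_top -= n;
}

/* Push a goto state.  Return 0 on success, or 2 if the exploratory
   stack is full (it does not grow).  */
int
explore_push (explore *e, short state)
{
  if (e->es_len == ES_CAPACITY)
    return 2;
  e->es[e->es_len++] = state;
  return 0;
}
```

With a capacity of 8, eight pushes succeed and the ninth returns 2.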

File diff suppressed because it is too large.

@@ -1,4 +1,4 @@
/* A Bison parser, made by GNU Bison 2.4.1.247-0e0f-dirty. */
/* A Bison parser, made by GNU Bison 2.4.1.252-dcd39. */
/* Skeleton interface for Bison's Yacc-like parsers in C
@@ -161,7 +161,7 @@
typedef union YYSTYPE
{
/* Line 1748 of yacc.c */
/* Line 1971 of yacc.c */
#line 94 "parse-gram.y"
symbol *symbol;
@@ -176,7 +176,7 @@ typedef union YYSTYPE
/* Line 1748 of yacc.c */
/* Line 1971 of yacc.c */
#line 181 "parse-gram.h"
} YYSTYPE;
# define YYSTYPE_IS_TRIVIAL 1

@@ -94,46 +94,52 @@ main (int argc, const char *argv[])
}
]])
# Specify the output files to avoid problems on different file systems.
AT_BISON_CHECK([-o input.c input.y])
m4_pushdef([AT_NONASSOC_AND_EOF_CHECK],
[AT_BISON_CHECK([$1[ -o input.c input.y]])
AT_COMPILE([input])
m4_pushdef([AT_EXPECTING], [m4_if($2, [correct], [[, expecting $end]])])
AT_PARSER_CHECK([./input '0<0'])
AT_PARSER_CHECK([./input '0<0<0'], [1], [],
[syntax error, unexpected '<'
[syntax error, unexpected '<'AT_EXPECTING
])
AT_PARSER_CHECK([./input '0>0'])
AT_PARSER_CHECK([./input '0>0>0'], [1], [],
[syntax error, unexpected '>'
[syntax error, unexpected '>'AT_EXPECTING
])
AT_PARSER_CHECK([./input '0<0>0'], [1], [],
[syntax error, unexpected '>'
[syntax error, unexpected '>'AT_EXPECTING
])
m4_popdef([AT_EXPECTING])])
# Expected token list is missing.
AT_NONASSOC_AND_EOF_CHECK([], [[incorrect]])
# We must disable default reductions in inconsistent states in order to
# have an explicit list of all expected tokens. (However, unless we use
# canonical LR, lookahead sets are merged for different left contexts,
# so it is still possible to have extra incorrect tokens in the expected
# list. That just doesn't happen to be a problem for this test case.)
# have an explicit list of all expected tokens.
AT_NONASSOC_AND_EOF_CHECK([[-Dlr.default-reductions=consistent]],
[[correct]])
AT_BISON_CHECK([-Dlr.default-reductions=consistent -o input.c input.y])
AT_COMPILE([input])
# lr.default-reductions=consistent happens to work for this test case.
# However, for other grammars, lookahead sets can be merged for
# different left contexts, so it is still possible to have an incorrect
# expected list. Canonical LR is almost a general solution (that is, it
# can fail only when %nonassoc is used), so make sure it gives the same
# result as above.
AT_NONASSOC_AND_EOF_CHECK([[-Dlr.type=canonical-lr]], [[correct]])
AT_PARSER_CHECK([./input '0<0'])
AT_PARSER_CHECK([./input '0<0<0'], [1], [],
[syntax error, unexpected '<', expecting $end
])
# parse.lac=full is a completely general solution that does not require
# any of the above sacrifices. Of course, it does not extend the
# language-recognition power of LALR to (IE)LR, but it does ensure that
# the reported list of expected tokens matches what the given parser
# would have accepted in place of the unexpected token.
AT_NONASSOC_AND_EOF_CHECK([[-Dparse.lac=full]], [[correct]])
AT_PARSER_CHECK([./input '0>0'])
AT_PARSER_CHECK([./input '0>0>0'], [1], [],
[syntax error, unexpected '>', expecting $end
])
AT_PARSER_CHECK([./input '0<0>0'], [1], [],
[syntax error, unexpected '>', expecting $end
])
m4_popdef([AT_NONASSOC_AND_EOF_CHECK])
AT_CLEANUP
@@ -343,6 +349,18 @@ AT_CONSISTENT_ERRORS_CHECK([[%define lr.type canonical-lr]],
[AT_PREVIOUS_STATE_INPUT],
[[$end]], [[ab]])
# Only LAC gets it right.
AT_CONSISTENT_ERRORS_CHECK([[%define lr.type canonical-lr
%define parse.lac full]],
[AT_PREVIOUS_STATE_GRAMMAR],
[AT_PREVIOUS_STATE_INPUT],
[[$end]], [[b]])
AT_CONSISTENT_ERRORS_CHECK([[%define lr.type ielr
%define parse.lac full]],
[AT_PREVIOUS_STATE_GRAMMAR],
[AT_PREVIOUS_STATE_INPUT],
[[$end]], [[b]])
m4_popdef([AT_PREVIOUS_STATE_GRAMMAR])
m4_popdef([AT_PREVIOUS_STATE_INPUT])
@@ -422,6 +440,16 @@ AT_CONSISTENT_ERRORS_CHECK([[%define lr.type canonical-lr]],
[AT_USER_ACTION_INPUT],
[[$end]], [[a]])
AT_CONSISTENT_ERRORS_CHECK([[%define parse.lac full]],
[AT_USER_ACTION_GRAMMAR],
[AT_USER_ACTION_INPUT],
[['b']], [[none]])
AT_CONSISTENT_ERRORS_CHECK([[%define parse.lac full
%define lr.default-reductions accepting]],
[AT_USER_ACTION_GRAMMAR],
[AT_USER_ACTION_INPUT],
[[$end]], [[none]])
m4_popdef([AT_USER_ACTION_GRAMMAR])
m4_popdef([AT_USER_ACTION_INPUT])
@@ -431,6 +459,113 @@ AT_CLEANUP
## ------------------------------------------------------- ##
## LAC: %nonassoc requires splitting canonical LR states. ##
## ------------------------------------------------------- ##
# This test case demonstrates that, when %nonassoc is used, canonical
# LR(1) parser table construction followed by conflict resolution
# without further state splitting is not always sufficient to produce a
# parser that can detect all syntax errors as soon as possible on one
# token of lookahead. However, LAC solves the problem completely even
# with minimal LR parser tables.
AT_SETUP([[LAC: %nonassoc requires splitting canonical LR states]])
AT_DATA_GRAMMAR([[input.y]],
[[%code {
#include <stdio.h>
void yyerror (char const *);
int yylex (void);
}
%error-verbose
%nonassoc 'a'
%%
start:
'a' problem 'a' // First context.
| 'b' problem 'b' // Second context.
| 'c' reduce-nonassoc // Just makes reduce-nonassoc useful.
;
problem:
look reduce-nonassoc
| look 'a'
| look 'b'
;
// For the state reached after shifting the 'a' in these productions,
// lookahead sets are the same in both the first and second contexts.
// Thus, canonical LR reuses the same state for both contexts. However,
// the lookahead 'a' for the reduction "look: 'a'" later becomes an
// error action only in the first context. In order to immediately
// detect the syntax error on 'a' here for only the first context, this
// canonical LR state would have to be split into two states, and the
// 'a' lookahead would have to be removed from only one of the states.
look:
'a' // Reduction lookahead set is always ['a', 'b'].
| 'a' 'b'
| 'a' 'c' // 'c' is forgotten as an expected token.
;
reduce-nonassoc: %prec 'a';
%%
void
yyerror (char const *msg)
{
fprintf (stderr, "%s\n", msg);
}
int
yylex (void)
{
static char const *input = "aaa";
return *input++;
}
int
main (void)
{
return yyparse ();
}
]])
# Show canonical LR's failure.
AT_BISON_CHECK([[-Dlr.type=canonical-lr -o input.c input.y]],
[[0]], [[]],
[[input.y: conflicts: 2 shift/reduce
]])
AT_COMPILE([[input]])
AT_PARSER_CHECK([[./input]], [[1]], [[]],
[[syntax error, unexpected 'a', expecting 'b'
]])
# It's corrected by LAC.
AT_BISON_CHECK([[-Dlr.type=canonical-lr -Dparse.lac=full \
-o input.c input.y]], [[0]], [[]],
[[input.y: conflicts: 2 shift/reduce
]])
AT_COMPILE([[input]])
AT_PARSER_CHECK([[./input]], [[1]], [[]],
[[syntax error, unexpected 'a', expecting 'b' or 'c'
]])
# IELR is sufficient when LAC is used.
AT_BISON_CHECK([[-Dlr.type=ielr -Dparse.lac=full -o input.c input.y]],
[[0]], [[]],
[[input.y: conflicts: 2 shift/reduce
]])
AT_COMPILE([[input]])
AT_PARSER_CHECK([[./input]], [[1]], [[]],
[[syntax error, unexpected 'a', expecting 'b' or 'c'
]])
AT_CLEANUP
## ------------------------- ##
## Unresolved SR Conflicts. ##
## ------------------------- ##
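
The "expecting 'b' or 'c'" messages checked above come from asking, for each token in turn, whether it could eventually be shifted. A minimal C sketch of that collection loop follows. It is an illustration only: `collect_expected`, `EXPECTED_ARGS_MAX`, and the three sample predicates are hypothetical, the predicate stands in for yy_lac (0 = acceptable, 1 = not acceptable, 2 = memory exhausted), and the skeleton's skipping of the error and undefined tokens is omitted.

```c
/* Plays the role of YYERROR_VERBOSE_ARGS_MAXIMUM: the unexpected token
   plus at most four expected tokens.  */
enum { EXPECTED_ARGS_MAX = 5 };

/* Store UNEXPECTED in args[0], then every token accepted by CHECK.
   Return the number of entries written, 1 if there are too many
   expected tokens to list (only the unexpected token is reported),
   or -1 if CHECK reports memory exhaustion.  */
int
collect_expected (int ntokens, int unexpected,
                  int (*check) (int token),
                  int args[EXPECTED_ARGS_MAX])
{
  int count = 0;
  int tok;
  args[count++] = unexpected;
  for (tok = 0; tok < ntokens; ++tok)
    {
      int status = check (tok);
      if (status == 2)
        return -1;
      if (status == 1)
        continue;
      if (count == EXPECTED_ARGS_MAX)
        return 1;
      args[count++] = tok;
    }
  return count;
}

/* Sample predicates for trying the function out.  */
int
accept_2_and_3 (int token)
{
  return (token == 2 || token == 3) ? 0 : 1;
}

int
accept_all (int token)
{
  (void) token;
  return 0;
}

int
exhausted_on_1 (int token)
{
  return token == 1 ? 2 : 1;
}
```

Because every token is checked against the stack rather than against the current state's table row, the resulting list does not depend on which default reductions or merged lookahead sets the tables happen to contain.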

@@ -1283,3 +1283,22 @@ input.y:5.19: invalid character after \-escape: \001
]])
AT_CLEANUP
## ------------------------- ##
## LAC: Errors for %define. ##
## ------------------------- ##
AT_SETUP([[LAC: Errors for %define]])
AT_DATA([[input.y]],
[[%%
start: ;
]])
# parse.lac.* options are useless if LAC isn't actually activated.
AT_BISON_CHECK([[-Dparse.lac.es-capacity-initial=1 input.y]],
[[1]], [],
[[<command line>:2: %define variable `parse.lac.es-capacity-initial' is not used
]])
AT_CLEANUP

@@ -1477,3 +1477,186 @@ memory exhausted
]])
AT_CLEANUP
## ------------------------ ##
## LAC: Exploratory stack. ##
## ------------------------ ##
AT_SETUP([[LAC: Exploratory stack]])
m4_pushdef([AT_LAC_CHECK], [
AT_BISON_OPTION_PUSHDEFS([$1])
AT_DATA_GRAMMAR([input.y],
[[%code {
#include <stdio.h>
void yyerror (char const *);
int yylex (]AT_PURE_IF([[YYSTYPE *]], [[void]])[);
}
]$1[
%error-verbose
%token 'c'
%%
// default reductions in inconsistent states
// v v v v v v v v v v v v v v
S: A B A A B A A A A B A A A A A A A B C C A A A A A A A A A A A A B ;
A: 'a' | /*empty*/ { printf ("inconsistent default reduction\n"); } ;
B: 'b' ;
C: /*empty*/ { printf ("consistent default reduction\n"); } ;
%%
void
yyerror (char const *msg)
{
fprintf (stderr, "%s\n", msg);
}
int
yylex (]AT_PURE_IF([[YYSTYPE *v]], [[void]])[)
{
static char const *input = "bbbbc";]AT_PURE_IF([[
*v = 0;]])[
return *input++;
}
int
main (void)
{
yydebug = 1;
return yyparse ();
}
]])
# Give exactly the right amount of memory to be sure there's no
# off-by-one error, for example.
AT_BISON_CHECK([[-Dparse.lac=full -Dparse.lac.es-capacity=12 \
-t -o input.c input.y]], [[0]], [],
[[input.y: conflicts: 21 shift/reduce
]])
AT_COMPILE([[input]])
AT_PARSER_CHECK([[./input > stdout.txt 2> stderr.txt]], [[1]])
# Make sure syntax error doesn't forget that 'a' is expected. It would
# be forgotten without lookahead correction.
AT_CHECK([[grep 'syntax error,' stderr.txt]], [[0]],
[[syntax error, unexpected 'c', expecting 'a' or 'b'
]])
# Check number of default reductions in inconsistent states to be sure
# syntax error is detected before unnecessary reductions are performed.
AT_CHECK([[perl -0777 -ne 'print s/inconsistent default reduction//g;' \
< stdout.txt || exit 77]], [[0]], [[14]])
# Check number of default reductions in consistent states to be sure
# they are performed before the syntax error is detected.
AT_CHECK([[perl -0777 -ne 'print s/\bconsistent default reduction//g;' \
< stdout.txt || exit 77]], [[0]], [[2]])
AT_BISON_OPTION_POPDEFS
])
AT_LAC_CHECK([[%define api.push-pull pull]])
AT_LAC_CHECK([[%define api.push-pull pull %define api.pure]])
AT_LAC_CHECK([[%define api.push-pull both]])
AT_LAC_CHECK([[%define api.push-pull both %define api.pure]])
m4_popdef([AT_LAC_CHECK])
AT_CLEANUP
## ------------------------ ##
## LAC: Memory exhaustion. ##
## ------------------------ ##
AT_SETUP([[LAC: Memory exhaustion]])
m4_pushdef([AT_LAC_CHECK], [
AT_DATA_GRAMMAR([input.y],
[[%code {
#include <stdio.h>
void yyerror (char const *);
int yylex (void);
}
%error-verbose
%%
S: A A A A A A A A A ;
A: /*empty*/ | 'a' ;
%%
void
yyerror (char const *msg)
{
fprintf (stderr, "%s\n", msg);
}
int
yylex (void)
{
static char const *input = "]$1[";
return *input++;
}
int
main (void)
{
yydebug = 1;
return yyparse ();
}
]])
AT_BISON_CHECK([[-Dparse.lac=full -Dparse.lac.es-capacity=8 \
-t -o input.c input.y]], [[0]], [],
[[input.y: conflicts: 8 shift/reduce
]])
AT_COMPILE([[input]])
])
# Check for memory exhaustion during parsing.
AT_LAC_CHECK([[]])
AT_PARSER_CHECK([[./input]], [[2]], [[]],
[[Starting parse
Entering state 0
Reading a token: Now at end of input.
LAC: initial context established for $end
LAC: checking lookahead $end: R2 G3 R2 G5 R2 G6 R2 G7 R2 G8 R2 G9 R2 G10 R2 G11 R2 (max stack size exceeded)
memory exhausted
Cleanup: discarding lookahead token $end ()
Stack now 0
]])
# Induce an immediate syntax error with an undefined token, and check
# for memory exhaustion while building syntax error message.
AT_LAC_CHECK([[z]], [[0]])
AT_PARSER_CHECK([[./input]], [[2]], [[]],
[[Starting parse
Entering state 0
Reading a token: Next token is token $undefined ()
LAC: initial context established for $undefined
LAC: checking lookahead $undefined: Always Err
Constructing syntax error message
LAC: checking lookahead $end: R2 G3 R2 G5 R2 G6 R2 G7 R2 G8 R2 G9 R2 G10 R2 G11 R2 (max stack size exceeded)
syntax error
memory exhausted
Cleanup: discarding lookahead token $undefined ()
Stack now 0
]])
m4_popdef([AT_LAC_CHECK])
AT_CLEANUP