glr.cc: fix the handling of syntax_error from the scanner

Commit 90a8537e62 was right, but issued two error messages.  Commit
80ef7e7639 tried to address that by mapping yychar and yytoken to
empty, but that completely breaks the invariants of glr.c.  In
particular, yygetToken can be called repeatedly and is expected to
return the latest result, unless yytoken is YYEMPTY.  Since the
previous attempt "recorded" that the token came from an exception by
setting it to YYEMPTY, we fetched another token instead of getting
the faulty one again.

Rather, revert to the first approach: map yytoken to "invalid token",
but record in yychar the fact that we come from an exception thrown in
the scanner.
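
The difference between the two approaches can be sketched with a toy
model.  This is not the code of glr.c: ToyParser, TOY_EMPTY,
TOY_FAULTY, TOY_INVALID_TOKEN and scans_after_two_calls are
hypothetical names made up for the illustration, and the numeric
values are arbitrary.  It only assumes what is described above: the
lookahead is cached in yychar, and the scanner is called again only
when yychar is empty.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Toy stand-ins, NOT the values used by data/skeletons/glr.c.
enum ToyChar { TOY_EMPTY = -2, TOY_FAULTY = -3 };
const int TOY_INVALID_TOKEN = -1;

struct ToyParser
{
  std::string input;
  std::size_t pos = 0;
  int yychar = TOY_EMPTY;  // Cached lookahead.
  int scans = 0;           // Number of real calls to the scanner.
  int errors = 0;          // Number of exceptions caught.

  explicit ToyParser (const std::string &s) : input (s) {}

  // Toy scanner: 'l' is a lexical error, 0 is end of input.
  int scan ()
  {
    ++scans;
    if (pos >= input.size ())
      return 0;
    char c = input[pos++];
    if (c == 'l')
      throw std::runtime_error ("invalid character");
    return c;
  }

  // Return the lookahead, calling the scanner only when the cache is
  // empty.  With record_fault == false, a scanner exception leaves
  // yychar empty (the broken approach); with record_fault == true it
  // is remembered as TOY_FAULTY (the approach of this commit).
  int get_token (bool record_fault)
  {
    if (yychar == TOY_EMPTY)
      {
        try
          {
            yychar = scan ();
          }
        catch (const std::runtime_error &)
          {
            ++errors;
            yychar = record_fault ? TOY_FAULTY : TOY_EMPTY;
            return TOY_INVALID_TOKEN;
          }
      }
    return yychar == TOY_FAULTY ? TOY_INVALID_TOKEN : yychar;
  }
};

// Ask twice for the lookahead on the input "la" and report how many
// times the scanner was really invoked.
int scans_after_two_calls (bool record_fault)
{
  ToyParser p ("la");
  int first = p.get_token (record_fault);
  assert (first == TOY_INVALID_TOKEN);
  p.get_token (record_fault);
  return p.scans;
}
```

In this model, leaving yychar empty after the exception makes the
second request consume the following 'a', whereas recording the fault
keeps returning the invalid token without calling the scanner again.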

* data/skeletons/glr.c (YYFAULTYTOK): New.
(yygetToken): Use it to record syntax errors from the scanner.
* tests/c++.at (Syntax error as exception): In addition to checking
syntax_error with error recovery, make sure it also behaves as
expected without.
Akim Demaille
2019-01-03 09:43:36 +01:00
parent b90675e67a
commit 84276bc3d5
3 changed files with 51 additions and 27 deletions


@@ -957,14 +957,17 @@ AT_DATA_GRAMMAR([[input.yy]],
%define parse.trace
%%
start:
thing
| start thing
start: with-recovery | '!' without-recovery;
with-recovery:
%empty
| with-recovery item
| with-recovery error { std::cerr << "caught error\n"; }
;
thing:
error { std::cerr << "caught error\n"; }
| item
without-recovery:
%empty
| without-recovery item
;
item:
@@ -988,17 +991,15 @@ yy::parser::error (const std::string &m)
AT_DATA_SOURCE([scan.cc],
[[#include "input.hh"
// 'a': valid item, 's': syntax error, 'l': lexical error.
int
yylex (yy::parser::semantic_type *)
yylex (yy::parser::semantic_type *lval)
{
// 's': syntax error, 'l': lexical error.
//
// Leave enough valid tokens to make sure we recovered from the
// previous error, otherwise we might hide some error messages
// (discarded during error recovery).
static char const *input = "asaaalaa";
switch (int res = *input++)
switch (int res = getchar ())
{
// Don't choke on echo's \n.
case '\n':
return yylex (lval);
case 'l':
throw yy::parser::syntax_error ("invalid character");
default:
@@ -1010,15 +1011,27 @@ yylex (yy::parser::semantic_type *)
AT_BISON_CHECK([[-o input.cc input.yy]])
AT_FOR_EACH_CXX([
AT_LANG_COMPILE([[input]], [[input.cc scan.cc]])
AT_LANG_COMPILE([[input]], [[input.cc scan.cc]])
AT_PARSER_CHECK([[./input]], [[0]], [[]],
# Leave enough valid tokens to make sure we recovered from the
# previous error, otherwise we might hide some error messages
# (discarded during error recovery).
AT_PARSER_CHECK([[echo "asaaalaa" | ./input ]], [[0]], [[]],
[[error: invalid expression
caught error
error: invalid character
caught error
]])
])
AT_PARSER_CHECK([[echo "!as" | ./input ]], [1], [],
[[error: invalid expression
]])
AT_PARSER_CHECK([[echo "!al" | ./input ]], [1], [],
[[error: invalid character
]])
]) # AT_FOR_EACH_CXX
AT_BISON_OPTION_POPDEFS
AT_CLEANUP
@@ -1029,6 +1042,8 @@ AT_TEST([%skeleton "glr.cc"])
m4_popdef([AT_TEST])
## ------------------ ##
## Exception safety. ##
## ------------------ ##