mirror of
https://git.savannah.gnu.org/git/bison.git
synced 2026-03-09 12:23:04 +00:00
parser: keep string aliases as the user wrote it
Currently our scanner decodes all the escapes in the strings, and we
later reescape the strings when we emit them.
This is troublesome, as we do not respect the user input. For
instance, when the user writes in UTF-8, we destroy her string when we
write it back. And this shows everywhere: in the reports we show the
escaped string instead of the actual alias:
0 $accept: . exp $end
1 exp: . exp "\342\212\225" exp
2 | . exp "+" exp
3 | . exp "+" exp
4 | . "number"
5 | . "\303\221\303\271\341\271\203\303\251\342\204\235\303\264"
"number" shift, and go to state 1
"\303\221\303\271\341\271\203\303\251\342\204\235\303\264" shift, and go to state 2
This commit preserves the user's exact spelling of the string aliases,
instead of interpreting the escapes and then reescaping. The report
now shows:
0 $accept: . exp $end
1 exp: . exp "⊕" exp
2 | . exp "+" exp
3 | . exp "+" exp
4 | . "number"
5 | . "Ñùṃéℝô"
"number" shift, and go to state 1
"Ñùṃéℝô" shift, and go to state 2
Likewise, the XML (and therefore HTML) outputs are fixed.
* src/scan-gram.l (STRING, TSTRING): Do not interpret the escapes in
the resulting string.
* src/parse-gram.y (unquote, parser_init, parser_free, unquote_free)
(handle_defines, handle_language, obstack_for_unquote): New.
Use them to unquote where needed.
* tests/regression.at, tests/report.at: Update.
This commit is contained in:
@@ -415,7 +415,7 @@ AT_BISON_CHECK([-fcaret -o input.c input.y], [[0]], [[]],
|
||||
input.y:25.8-14: note: previous declaration
|
||||
25 | %token SPECIAL "\\\'\?\"\a\b\f\n\r\t\v\001\201\x001\x000081??!"
|
||||
| ^~~~~~~
|
||||
input.y:26.16-63: warning: symbol "\\'?\"\a\b\f\n\r\t\v\001\201\001\201??!" used more than once as a literal string [-Wother]
|
||||
input.y:26.16-63: warning: symbol "\\\'\?\"\a\b\f\n\r\t\v\001\201\x001\x000081??!" used more than once as a literal string [-Wother]
|
||||
26 | %token SPECIAL "\\\'\?\"\a\b\f\n\r\t\v\001\201\x001\x000081??!"
|
||||
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
]])
|
||||
@@ -427,7 +427,7 @@ AT_COMPILE([input])
|
||||
# symbol name reported by the parser is exactly the same as that reported by
|
||||
# Bison itself.
|
||||
AT_PARSER_CHECK([input], 1, [],
|
||||
[[syntax error, unexpected a, expecting ]AT_ERROR_VERBOSE_IF([["\\'?\"\a\b\f\n\r\t\v\001\201\001\201??!"]], [[∃¬∩∪∀]])[
|
||||
[[syntax error, unexpected a, expecting ]AT_ERROR_VERBOSE_IF([["\\\'\?\"\a\b\f\n\r\t\v\001\201\x001\x000081??!"]], [[∃¬∩∪∀]])[
|
||||
]])
|
||||
|
||||
AT_BISON_OPTION_POPDEFS
|
||||
|
||||
Reference in New Issue
Block a user