Currently we use "quotearg" to escape the strings output in Dot. As a
result, if the user's locale is C for instance, all the non-ASCII are
escaped. Unfortunately graphviz does not interpret this style of
escaping.
For instance:
5 -> 2 [style=solid label="\"\303\221\303\271\341\271\203\303\251\342\204\235\303\264\""]
was displayed as a sequence of numbers. We now output:
5 -> 2 [style=solid label="\"Ñùṃéℝô\""]
independently of the user's locale.
* src/system.h (obstack_backslash): New.
* src/graphviz.h, src/graphviz.c (escape): Remove, use
obstack_backslash instead.
* src/print-graph.c: Likewise.
* tests/report.at: Adjust.
Currently our scanner decodes all the escapes in the strings, and we
later reescape the strings when we emit them.
This is troublesome, as we do not respect the user input. For
instance, when the user writes in UTF-8, we destroy her string when we
write it back. And this shows everywhere: in the reports we show the
escaped string instead of the actual alias:
0 $accept: . exp $end
1 exp: . exp "\342\212\225" exp
2 | . exp "+" exp
3 | . exp "+" exp
4 | . "number"
5 | . "\303\221\303\271\341\271\203\303\251\342\204\235\303\264"
"number" shift, and go to state 1
"\303\221\303\271\341\271\203\303\251\342\204\235\303\264" shift, and go to state 2
This commit preserves the user's exact spelling of the string aliases,
instead of interpreting the escapes and then reescaping. The report
now shows:
0 $accept: . exp $end
1 exp: . exp "⊕" exp
2 | . exp "+" exp
3 | . exp "+" exp
4 | . "number"
5 | . "Ñùṃéℝô"
"number" shift, and go to state 1
"Ñùṃéℝô" shift, and go to state 2
Likewise, the XML (and therefore HTML) outputs are fixed.
* src/scan-gram.l (STRING, TSTRING): Do not interpret the escapes in
the resulting string.
* src/parse-gram.y (unquote, parser_init, parser_free, unquote_free)
(handle_defines, handle_language, obstack_for_unquote): New.
Use them to unquote where needed.
* tests/regression.at, tests/report.at: Update.
This is to record the current state of the report, which escapes the
UTF-8 characters (as parse.error="verbose" does), but shouldn't (as
parse.error="detailed" does).
* tests/report.at: here.
Be robust to newer versions of Autoconf where the package URL defaults
to https instead of http.
* configure.ac (AC_INIT): Use https.
* tests/report.at: Adjust expected output s/http/https/
to match updated URL.
The format is inconsistent. For instance most sections are
indented (including "Terminals unused in grammar" for instance), but
the sections "Terminals, with rules where they appear" and
"Nonterminals, with rules where they appear" are not. Let's indent
them. Also, these two sections try to wrap the output to avoid lines
too long. Yet we don't do that in the rest of the file, for instance
when listing the lookaheads of an item.
For instance in the case of Bison's parse-gram.output we go from:
Terminals, with rules where they appear
"end of file" (0) 0
error (256) 28 88
"string" <char*> (258) 9 13 16 17 20 23 24 109 116
[...]
Nonterminals, with rules where they appear
$accept (58)
on left: 0
input (59)
on left: 1, on right: 0
prologue_declarations (60)
on left: 2 3, on right: 1 3
prologue_declaration (61)
on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24
25 26 27 28 29, on right: 3
[...]
to
Terminals, with rules where they appear
"end of file" (0) 0
error (256) 28 88
"string" <char*> (258) 9 13 16 17 20 23 24 109 116
[...]
Nonterminals, with rules where they appear
$accept (58)
on left: 0
input (59)
on left: 1
on right: 0
prologue_declarations (60)
on left: 2 3
on right: 1 3
prologue_declaration (61)
on left: 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24 25 26 27 28 29
on right: 3
[...]
* src/print.c (END_TEST): Remove.
(print_terminal_symbols): Don't try to wrap the output.
(print_nonterminal_symbols): Likewise.
Make two different lines for occurrences on the left, and occurrence
on the rhs of the rules.
Indent by 4 and 8, not 3.
* src/reduce.c (reduce_output): Indent by 4, not 3.
* tests/conflicts.at, tests/existing.at, tests/reduce.at,
* tests/regression.at, tests/report.at:
Adjust.
* tests/input.at (_AT_UNUSED_VALUES_DECLARATIONS): Check
typed mid-rule actions.
* tests/report.at (Reports): Check that types of typed mid-rule
actions are reported.
* tests/actions.at (Typed mid-rule actions): Check that
the values of typed mid-rule actions are correct.