mirror of
https://git.savannah.gnu.org/git/bison.git
synced 2026-03-23 03:03:02 +00:00
doc: cex documentation
* NEWS, doc/bison.texi: Add documentation for conflict counterexample generation.
This commit is contained in:
committed by
Akim Demaille
parent
ed9a821caa
commit
1247d94ba6
24
NEWS
24
NEWS
@@ -4,6 +4,30 @@ GNU Bison NEWS
|
|||||||
|
|
||||||
** New features
|
** New features
|
||||||
|
|
||||||
|
*** Counterexample Generation
|
||||||
|
|
||||||
|
When given `--report=counterexamples` or `-Wcounterexamples`, bison will
|
||||||
|
now output counterexamples for conflicts in the grammar. These are
|
||||||
|
strings in the grammar which can be parsed in two ways due to the
|
||||||
|
conflict. For example:
|
||||||
|
|
||||||
|
Example exp '+' exp • '/' exp
|
||||||
|
First derivation exp ::=[ exp ::=[ exp '+' exp • ] '/' exp ]
|
||||||
|
Second derivation exp ::=[ exp '+' exp ::=[ exp • '/' exp ] ]
|
||||||
|
|
||||||
|
This is a shift/reduce conflict caused by none of the operators having
|
||||||
|
precedence, so the example can be parsed in the two ways shown. When
|
||||||
|
bison cannot find an example that can be derived in two ways, it instead
|
||||||
|
generates two examples that are the same up until the dot:
|
||||||
|
|
||||||
|
First example expr • ID $end
|
||||||
|
First derivation $accept ::=[ s ::=[ a ::=[ expr • ] ID ] $end ]
|
||||||
|
Second example expr • ID ',' ID $end
|
||||||
|
Second derivation $accept ::=[ s ::=[ a ::=[ expr ::=[ expr • ID ',' ] ] ID ] $end ]
|
||||||
|
|
||||||
|
In these cases, the parser usually doesn't have enough lookahead to
|
||||||
|
differentiate the two given examples.
|
||||||
|
|
||||||
*** File prefix mapping
|
*** File prefix mapping
|
||||||
|
|
||||||
Bison learned a new argument, `--file-prefix-map OLD=NEW`. Any file path
|
Bison learned a new argument, `--file-prefix-map OLD=NEW`. Any file path
|
||||||
|
|||||||
@@ -9797,12 +9797,69 @@ calc.y:19.1-7: @dwarning{warning}: nonterminal useless in grammar: useless [@dwa
|
|||||||
19 | @dwarning{useless: STR;}
|
19 | @dwarning{useless: STR;}
|
||||||
| @dwarning{^~~~~~~}
|
| @dwarning{^~~~~~~}
|
||||||
calc.y: @dwarning{warning}: 7 shift/reduce conflicts [@dwarning{-Wconflicts-sr}]
|
calc.y: @dwarning{warning}: 7 shift/reduce conflicts [@dwarning{-Wconflicts-sr}]
|
||||||
|
calc.y: @dwarning{warning}: rerun with option '-Wcounterexamples' to generate conflict counterexamples [@dwarning{-Wother}]
|
||||||
@end example
|
@end example
|
||||||
|
|
||||||
When given @option{--report=state}, in addition to @file{calc.tab.c}, it
|
When given @option{-Wcounterexamples}, @command{bison} will run a search for
|
||||||
creates a file @file{calc.output} with contents detailed below. The
|
strings in your grammar that better demonstrate you
|
||||||
order of the output and the exact presentation might vary, but the
|
conflicts. Counterexample generation was initially developed by Chinawat
|
||||||
interpretation is the same.
|
Isradisaikul and Andrew Myers (@pxref{Bibliography}). For @file{calc.y},
|
||||||
|
the first printed example is:
|
||||||
|
|
||||||
|
@example
|
||||||
|
Shift/reduce conflict on token '/':
|
||||||
|
Example exp '+' exp • '/' exp
|
||||||
|
First derivation exp ::=[ exp ::=[ exp '+' exp • ] '/' exp ]
|
||||||
|
Example exp '+' exp • '/' exp
|
||||||
|
Second derivation exp ::=[ exp '+' exp ::=[ exp • '/' exp ] ]
|
||||||
|
@end example
|
||||||
|
|
||||||
|
This shows two separate derivations in the grammar for the same @code{exp}:
|
||||||
|
@samp{e1 + e2 / e3}. The derivations show how your rules would parse the
|
||||||
|
given example. Here, the first derivation completes a reduction when seeing
|
||||||
|
@samp{/}, causing @samp{e1 + e2} to be grouped as an @code{exp}. The second
|
||||||
|
derivation shifts on @samp{/}, resulting in @samp{e2 / e3} being grouped as
|
||||||
|
an @code{exp}. Therefore, it is easy to see that adding @code{%precedence}
|
||||||
|
directives would fix this conflict.
|
||||||
|
|
||||||
|
Sometimes, the search will not find an example that can be derived in two
|
||||||
|
ways. In these cases, counterexample generation will provide two examples
|
||||||
|
that are the same up until the dot. Most notably, this will happen when
|
||||||
|
your grammar requires a stronger parser (more lookahead, LR instead of
|
||||||
|
LALR). The following example isn't LR(1):
|
||||||
|
|
||||||
|
@example
|
||||||
|
%token ID
|
||||||
|
%%
|
||||||
|
s: a ID
|
||||||
|
a: expr
|
||||||
|
expr: %empty | expr ID ','
|
||||||
|
@end example
|
||||||
|
|
||||||
|
@command{bison} reports:
|
||||||
|
|
||||||
|
@example
|
||||||
|
Shift/reduce conflict on token ID:
|
||||||
|
First example expr • ID $end
|
||||||
|
First derivation $accept ::=[ s ::=[ a ::=[ expr • ] ID ] $end ]
|
||||||
|
Second example expr • ID ',' ID $end
|
||||||
|
Second derivation $accept ::=[ s ::=[ a ::=[ expr ::=[ expr • ID ',' ] ] ID ] $end ]
|
||||||
|
@end example
|
||||||
|
|
||||||
|
This conflict is caused by the parser not having enough information to know
|
||||||
|
the difference between these two examples. The parser would need an
|
||||||
|
additional lookahead token to know whether or not a comma follows the
|
||||||
|
@code{ID} after @code{expr}. These types of conflicts tend to be more
|
||||||
|
difficult to fix, and usually need a rework of the grammar. In this case,
|
||||||
|
it can be fixed by changing around the recursion: @code{expr: ID | ',' expr
|
||||||
|
ID}.
|
||||||
|
|
||||||
|
Counterexamples can also be written to a file with @option{--report=cex}.
|
||||||
|
|
||||||
|
Going back to the calc example, when given @option{--report=state},
|
||||||
|
in addition to @file{calc.tab.c}, it creates a file @file{calc.output}
|
||||||
|
with contents detailed below. The order of the output and the exact
|
||||||
|
presentation might vary, but the interpretation is the same.
|
||||||
|
|
||||||
@noindent
|
@noindent
|
||||||
@cindex token, useless
|
@cindex token, useless
|
||||||
@@ -15211,6 +15268,7 @@ resolution. @xref{Unreachable States}.
|
|||||||
@unnumbered Bibliography
|
@unnumbered Bibliography
|
||||||
|
|
||||||
@c Please follow the following canvas to add more references.
|
@c Please follow the following canvas to add more references.
|
||||||
|
@c And keep sorted alphabetically.
|
||||||
|
|
||||||
@table @asis
|
@table @asis
|
||||||
@item [Corbett 1984]
|
@item [Corbett 1984]
|
||||||
@@ -15252,6 +15310,14 @@ Look-Ahead Sets, in @cite{ACM Transactions on Programming Languages and
|
|||||||
Systems}, Vol.@: 4, No.@: 4 (October 1982), pp.@:
|
Systems}, Vol.@: 4, No.@: 4 (October 1982), pp.@:
|
||||||
615--649. @uref{http://dx.doi.org/10.1145/69622.357187}
|
615--649. @uref{http://dx.doi.org/10.1145/69622.357187}
|
||||||
|
|
||||||
|
@item [Isradisaikul, Myers 2015]
|
||||||
|
Chinawat Isradisaikul, Andrew Myers,
|
||||||
|
Finding Counterexamples from Parsing Conflicts,
|
||||||
|
in @cite{Proceedings of the 36th ACM SIGPLAN Conference on
|
||||||
|
Programming Language Design and Implementation} (PLDI '15),
|
||||||
|
ACM, pp.@: 555--564.
|
||||||
|
@uref{https://www.cs.cornell.edu/andru/papers/cupex/cupex.pdf}
|
||||||
|
|
||||||
@item [Johnson 1978]
|
@item [Johnson 1978]
|
||||||
Steven C. Johnson,
|
Steven C. Johnson,
|
||||||
A portable compiler: theory and practice,
|
A portable compiler: theory and practice,
|
||||||
@@ -15345,17 +15411,18 @@ London, Department of Computer Science, TR-00-12 (December 2000).
|
|||||||
@c LocalWords: typename emplace Wconversion Wshorten yacchack reentrancy ou
|
@c LocalWords: typename emplace Wconversion Wshorten yacchack reentrancy ou
|
||||||
@c LocalWords: Relocatability exprs fixit Wyacc parseable fixits ffixit svg
|
@c LocalWords: Relocatability exprs fixit Wyacc parseable fixits ffixit svg
|
||||||
@c LocalWords: DNDEBUG cstring Wzero workalike POPL workalikes byacc UCB
|
@c LocalWords: DNDEBUG cstring Wzero workalike POPL workalikes byacc UCB
|
||||||
@c LocalWords: Penello's Penello Byson Byson's Corbett's CSD TOPLAS PDP
|
@c LocalWords: Penello's Penello Byson Byson's Corbett's CSD TOPLAS PDP cex
|
||||||
@c LocalWords: Beazley's goyacc ocamlyacc SIGACT SIGPLAN colorWarning exVal
|
@c LocalWords: Beazley's goyacc ocamlyacc SIGACT SIGPLAN colorWarning exVal
|
||||||
@c LocalWords: setcolor rgbError colorError rgbNotice colorNotice derror
|
@c LocalWords: setcolor rgbError colorError rgbNotice colorNotice derror
|
||||||
@c LocalWords: colorOff maincolor inlineraw darkviolet darkcyan dwarning
|
@c LocalWords: colorOff maincolor inlineraw darkviolet darkcyan dwarning
|
||||||
@c LocalWords: dnotice copyable stdint ptrdiff bufsize yyreport invariants
|
@c LocalWords: dnotice copyable stdint ptrdiff bufsize yyreport invariants
|
||||||
@c LocalWords: xrefautomaticsectiontitle yysyntax yysymbol ARGMAX cond
|
@c LocalWords: xrefautomaticsectiontitle yysyntax yysymbol ARGMAX cond RTTI
|
||||||
@c LocalWords: Wdangling yytoken erreur syntaxe inattendu attendait nombre
|
@c LocalWords: Wdangling yytoken erreur syntaxe inattendu attendait nombre
|
||||||
@c LocalWords: YYUNDEF SymbolKind yypcontext YYENOMEM TOKENMAX getBundle
|
@c LocalWords: YYUNDEF SymbolKind yypcontext YYENOMEM TOKENMAX getBundle
|
||||||
@c LocalWords: ResourceBundle myResources getString getName getToken
|
@c LocalWords: ResourceBundle myResources getString getName getToken ylwrap
|
||||||
@c LocalWords: getLocation getExpectedTokens reportSyntaxError bistromathic
|
@c LocalWords: getLocation getExpectedTokens reportSyntaxError bistromathic
|
||||||
@c LocalWords: TokenKind
|
@c LocalWords: TokenKind Automake's rtti Wcounterexamples Chinawat PLDI
|
||||||
|
@c LocalWords: Isradisaikul
|
||||||
|
|
||||||
@c Local Variables:
|
@c Local Variables:
|
||||||
@c ispell-dictionary: "american"
|
@c ispell-dictionary: "american"
|
||||||
|
|||||||
Reference in New Issue
Block a user