mirror of
https://git.savannah.gnu.org/git/bison.git
synced 2026-03-09 12:23:04 +00:00
Do not allow identifiers that start with a dash.
This cleans up our previous fixes for a bug whereby Bison
discarded `.field' in `$-1.field'. The previous fixes were less
restrictive about where a dash could appear in an identifier, but
the restrictions were hard to explain. That bug was reported and
this final fix was originally suggested by Paul Hilfinger. This
also fixes a remaining bug reported by Paul Eggert whereby Bison
parses `%token ID -123' as `%token ID - 123' and handles `-' as an
identifier. Now, `-' cannot be an identifier. Discussed in
threads beginning at
<http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00000.html>,
<http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00004.html>.
* NEWS (2.5): Update entry describing the dash extension to
grammar symbol names. Also, move that entry before the named
references entry because the latter mentions the former.
* doc/bison.texinfo (Symbol): Update documentation for symbol
names. As suggested by Paul Eggert, mention the effect of periods
and dashes on named references.
(Decl Summary): Update documentation for unquoted %define values,
which, as a side effect, can no longer start with dashes either.
* src/scan-code.l (id): Implement.
* src/scan-gram.l (id): Implement.
* tests/actions.at (Exotic Dollars): Extend test group to exercise
bug reported by Paul Hilfinger.
* tests/input.at (Symbols): Update test group, and extend to
exercise bug reported by Paul Eggert.
* tests/named-refs.at (Stray symbols in brackets): Update test
group.
($ or @ followed by . or -): Likewise.
* tests/regression.at (Invalid inputs): Likewise.
(cherry picked from commit 82f3355eaf)
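
The effect of the `id` change can be illustrated outside of Flex. The following is a minimal sketch in Python, not part of the commit: `OLD_ID` and `NEW_ID` are transliterations of the old and new `id` definitions in the scanner files, and `id_prefix` is a hypothetical helper that returns the identifier a pattern matches at the start of a string. Python's `re` module backtracks instead of taking the longest match as Flex does, so the results are only claimed for the sample inputs used here.

```python
import re

# Transliteration of the scanner's "letter" class: a period, an ASCII
# letter, or an underscore.
LETTER = r"[.A-Za-z_]"

# Old rule: id  -*(-|{letter}({letter}|[-0-9])*)
OLD_ID = re.compile(rf"-*(?:-|{LETTER}(?:{LETTER}|[-0-9])*)")

# New rule: id  {letter}({letter}|[-0-9])*
NEW_ID = re.compile(rf"{LETTER}(?:{LETTER}|[-0-9])*")


def id_prefix(pattern, text):
    """Return the identifier matched at the start of text, or "" if none."""
    match = pattern.match(text)
    return match.group(0) if match else ""


if __name__ == "__main__":
    # Samples taken from the commit message and the tests/input.at grammar.
    for sample in ["-123", "-GOOD", ".GOOD", "push-pull", "1NV4L1D"]:
        print(f"{sample!r}: old={id_prefix(OLD_ID, sample)!r} "
              f"new={id_prefix(NEW_ID, sample)!r}")
```

Under the old pattern, `-123` yields the one-character identifier `-`, which corresponds to the `%token ID - 123` reading described in the commit message. Under the new pattern neither `-123` nor `-GOOD` begins an identifier, while `.GOOD` and `push-pull` are still accepted.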
33
ChangeLog
@@ -1,3 +1,36 @@
+2011-01-29 Joel E. Denny <joeldenny@joeldenny.org>
+
+Do not allow identifiers that start with a dash.
+This cleans up our previous fixes for a bug whereby Bison
+discarded `.field' in `$-1.field'. The previous fixes were less
+restrictive about where a dash could appear in an identifier, but
+the restrictions were hard to explain. That bug was reported and
+this final fix was originally suggested by Paul Hilfinger. This
+also fixes a remaining bug reported by Paul Eggert whereby Bison
+parses `%token ID -123' as `%token ID - 123' and handles `-' as an
+identifier. Now, `-' cannot be an identifier. Discussed in
+threads beginning at
+<http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00000.html>,
+<http://lists.gnu.org/archive/html/bug-bison/2011-01/msg00004.html>.
+* NEWS (2.5): Update entry describing the dash extension to
+grammar symbol names. Also, move that entry before the named
+references entry because the latter mentions the former.
+* doc/bison.texinfo (Symbol): Update documentation for symbol
+names. As suggested by Paul Eggert, mention the effect of periods
+and dashes on named references.
+(Decl Summary): Update documentation for unquoted %define values,
+which, as a side effect, can no longer start with dashes either.
+* src/scan-code.l (id): Implement.
+* src/scan-gram.l (id): Implement.
+* tests/actions.at (Exotic Dollars): Extend test group to exercise
+bug reported by Paul Hilfinger.
+* tests/input.at (Symbols): Update test group, and extend to
+exercise bug reported by Paul Eggert.
+* tests/named-refs.at (Stray symbols in brackets): Update test
+group.
+($ or @ followed by . or -): Likewise.
+* tests/regression.at (Invalid inputs): Likewise.
+
 2011-01-24 Joel E. Denny <joeldenny@joeldenny.org>
 
 * data/yacc.c: Fix last apostrophe warning from xgettext.
16
NEWS
@@ -3,6 +3,14 @@ Bison News
 
 * Changes in version 2.5 (????-??-??):
 
+** Grammar symbol names can now contain non-initial dashes:
+
+Consistently with directives (such as %error-verbose) and with
+%define variables (e.g. push-pull), grammar symbol names may contain
+dashes in any position except the beginning. This is a GNU
+extension over POSIX Yacc. Thus, use of this extension is reported
+by -Wyacc and rejected in Yacc mode (--yacc).
+
 ** Named references:
 
 Historically, Yacc and Bison have supported positional references
@@ -98,14 +106,6 @@ Bison News
 LAC is an experimental feature. More user feedback will help to
 stabilize it.
 
-** Grammar symbol names can now contain dashes:
-
-Consistently with directives (such as %error-verbose) and variables
-(e.g. push-pull), grammar symbol names may include dashes in any
-position, similarly to periods and underscores. This is GNU
-extension over POSIX Yacc whose use is reported by -Wyacc, and
-rejected in Yacc mode (--yacc).
-
 ** %define improvements:
 
 *** Can now be invoked via the command line:
doc/bison.texinfo
@@ -3049,12 +3049,13 @@ A @dfn{nonterminal symbol} stands for a class of syntactically
 equivalent groupings. The symbol name is used in writing grammar rules.
 By convention, it should be all lower case.
 
-Symbol names can contain letters, underscores, periods, dashes, and (not
-at the beginning) digits. Dashes in symbol names are a GNU
-extension, incompatible with POSIX Yacc. Terminal symbols
-that contain periods or dashes make little sense: since they are not
-valid symbols (in most programming languages) they are not exported as
-token names.
+Symbol names can contain letters, underscores, periods, and non-initial
+digits and dashes. Dashes in symbol names are a GNU extension, incompatible
+with POSIX Yacc. Periods and dashes make symbol names less convenient to
+use with named references, which require brackets around such names
+(@pxref{Named References}). Terminal symbols that contain periods or dashes
+make little sense: since they are not valid symbols (in most programming
+languages) they are not exported as token names.
 
 There are three ways of writing terminal symbols in the grammar:
 
@@ -4959,9 +4960,8 @@ Define a variable to adjust Bison's behavior.
 It is an error if a @var{variable} is defined by @code{%define} multiple
 times, but see @ref{Bison Options,,-D @var{name}[=@var{value}]}.
 
-@var{value} must be placed in quotation marks if it contains any
-character other than a letter, underscore, period, dash, or non-initial
-digit.
+@var{value} must be placed in quotation marks if it contains any character
+other than a letter, underscore, period, or non-initial dash or digit.
 
 Omitting @code{"@var{value}"} entirely is always equivalent to specifying
 @code{""}.
src/scan-code.l
@@ -85,7 +85,7 @@ splice (\\[ \f\t\v]*\n)*
 named symbol references. Shall be kept synchronized with
 scan-gram.l "letter" and "id". */
 letter [.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_]
-id -*(-|{letter}({letter}|[-0-9])*)
+id {letter}({letter}|[-0-9])*
 ref -?[0-9]+|{id}|"["{id}"]"|"$"
 
 %%
src/scan-gram.l
@@ -104,7 +104,7 @@ static void unexpected_newline (boundary, char const *);
 %x SC_BRACKETED_ID SC_RETURN_BRACKETED_ID
 
 letter [.abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_]
-id -*(-|{letter}({letter}|[-0-9])*)
+id {letter}({letter}|[-0-9])*
 directive %{id}
 int [0-9]+
 
tests/actions.at
@@ -158,6 +158,52 @@ AT_PARSER_CHECK([./input], 0,
 [[15
 ]])
 
+# Make sure that fields after $n or $-n are parsed correctly. At one
+# point while implementing dashes in symbol names, we were dropping
+# fields after $-n.
+AT_DATA_GRAMMAR([[input.y]],
+[[
+%{
+# include <stdio.h>
+static int yylex (void);
+static void yyerror (char const *msg);
+typedef struct { int val; } stype;
+# define YYSTYPE stype
+%}
+
+%%
+start: one two { $$.val = $1.val + $2.val; } sum ;
+one: { $$.val = 1; } ;
+two: { $$.val = 2; } ;
+sum: { printf ("%d\n", $0.val + $-1.val + $-2.val); } ;
+
+%%
+
+static int
+yylex (void)
+{
+return 0;
+}
+
+static void
+yyerror (char const *msg)
+{
+fprintf (stderr, "%s\n", msg);
+}
+
+int
+main (void)
+{
+return yyparse ();
+}
+]])
+
+AT_BISON_CHECK([[-o input.c input.y]])
+AT_COMPILE([[input]])
+AT_PARSER_CHECK([[./input]], [[0]],
+[[6
+]])
+
 AT_CLEANUP
 
 
tests/input.at
@@ -658,17 +658,20 @@ AT_BISON_CHECK([-o input.c input.y])
 AT_COMPILE([input.o], [-c input.c])
 
 
-# Periods and dashes are genuine letters, they can start identifiers.
-# Digits cannot.
+# Periods are genuine letters, they can start identifiers.
+# Digits and dashes cannot.
 AT_DATA_GRAMMAR([input.y],
 [[%token .GOOD
 -GOOD
 1NV4L1D
+-123
 %%
-start: .GOOD -GOOD
+start: .GOOD GOOD
 ]])
 AT_BISON_CHECK([-o input.c input.y], [1], [],
-[[input.y:11.10-16: invalid identifier: `1NV4L1D'
+[[input.y:10.10: invalid character: `-'
+input.y:11.10-16: invalid identifier: `1NV4L1D'
+input.y:12.10: invalid character: `-'
 ]])
 
 AT_CLEANUP
tests/named-refs.at
@@ -446,13 +446,14 @@ AT_SETUP([Stray symbols in brackets])
 AT_DATA_GRAMMAR([test.y],
 [[
 %%
-start: foo[ /* aaa */ *&-+ ] bar
+start: foo[ /* aaa */ *&-.+ ] bar
 { s = $foo; }
 ]])
 AT_BISON_CHECK([-o test.c test.y], 1, [],
 [[test.y:11.23: invalid character in bracketed name: `*'
 test.y:11.24: invalid character in bracketed name: `&'
-test.y:11.26: invalid character in bracketed name: `+'
+test.y:11.25: invalid character in bracketed name: `-'
+test.y:11.27: invalid character in bracketed name: `+'
 ]])
 AT_CLEANUP
 
@@ -570,23 +571,27 @@ AT_DATA([[test.y]],
 %%
 start:
 .field { $.field; }
-| -field { @-field; }
 | 'a' { @.field; }
-| 'a' { $-field; }
 ;
 .field: ;
--field: ;
 ]])
 AT_BISON_CHECK([[test.y]], [[1]], [],
 [[test.y:4.12-18: invalid reference: `$.field'
 test.y:4.13: syntax error after `$', expecting integer, letter, `_', `@<:@', or `$'
 test.y:4.3-8: possibly meant: $[.field] at $1
-test.y:5.12-18: invalid reference: `@-field'
+test.y:5.12-18: invalid reference: `@.field'
 test.y:5.13: syntax error after `@', expecting integer, letter, `_', `@<:@', or `$'
-test.y:5.3-8: possibly meant: @[-field] at $1
-test.y:6.12-18: invalid reference: `@.field'
-test.y:6.13: syntax error after `@', expecting integer, letter, `_', `@<:@', or `$'
-test.y:7.12-18: invalid reference: `$-field'
-test.y:7.13: syntax error after `$', expecting integer, letter, `_', `@<:@', or `$'
+]])
+AT_DATA([[test.y]],
+[[
+%%
+start:
+'a' { $-field; }
+| 'b' { @-field; }
+;
+]])
+AT_BISON_CHECK([[test.y]], [[0]], [],
+[[test.y:4.9: warning: stray `$'
+test.y:5.9: warning: stray `@'
 ]])
 AT_CLEANUP
tests/regression.at
@@ -392,7 +392,8 @@ input.y:3.14: invalid character: `}'
 input.y:4.1: invalid character: `%'
 input.y:4.2: invalid character: `&'
 input.y:5.1-17: invalid directive: `%a-does-not-exist'
-input.y:6.1-2: invalid directive: `%-'
+input.y:6.1: invalid character: `%'
+input.y:6.2: invalid character: `-'
 input.y:7.1-8.0: missing `%}' at end of file
 input.y:7.1-8.0: syntax error, unexpected %{...%}
 ]])