Commit Graph

641 Commits

Author SHA1 Message Date
Akim Demaille
b4f1840114 Get rid of yyrhs and yyprhs in lalr1.java.
* data/lalr1.java (yyrhs_, yyprhs_): Remove.
	(yy_reduce_print): Rather, use yystos_ and the state stack.
2008-11-25 22:50:10 +01:00
Akim Demaille
68dbdee86a Get rid of yyrhs and yyprhs in yacc.c.
They were used to get the symbol types, given a rule number, when
displaying the top of the stack before a reduction.  But the symbol type
is available from the state stack.  This has two benefits: two fewer
tables in the parser (making it smaller), and a more consistent use of the
three stacks, which will help to fuse them.

	* data/yacc.c (yyprhs, yyrhs): Remove.
	(YY_REDUCE_PRINT): Pass yyssp to yy_reduce_print.
	(yy_reduce_print): Take yyssp as argument.
	Use it, together with yystos, to get the symbol type.
	* tests/regression.at (Web2c Report): Remove these tables from the
	expected output.
2008-11-25 22:21:24 +01:00
Akim Demaille
6ab1adbe1e b4_tables_map.
The point is to factor the generation of the tables across skeletons.
This is language dependent.

	* data/c.m4 (b4_comment_): New.
	Should be usable to define how to generate tables independently of
	the language.
	(b4_c_comment): New.
	(b4_comment): Bounce to b4_c_comment.
	Now support $2 = [PREFIX] for indentation.
	* data/lalr1.cc (b4_table_declare): Don't output a comment if
	there is no comment.
	Indent it properly when there is one.
	Output the ending semicolon.
	(b4_table_define): Space changes.
	Output the ending semicolon.
	(b4_tables_map): New.
	Use it twice instead of declaring and defining the (integral)
	tables by hand.
2008-11-25 22:18:09 +01:00
Akim Demaille
0fddb3d59f b4_table_declare.
* data/lalr1.cc (b4_table_declare): New.
	Use it to declare the tables defined with b4_table_define.
	(b4_table_define): Declare a third arg to match b4_table_declare
	signature.
	Move all the comments around invocations of b4_table_define into
	the invocations themselves.
	Move things around to have the same order for declarations and
	definitions.
2008-11-25 22:14:39 +01:00
Akim Demaille
c4ddc0fb0b Formatting changes.
* data/lalr1.java: here.
2008-11-25 22:11:09 +01:00
Akim Demaille
e0c653e7e6 b4_args is more general than only C++.
* data/lalr1.cc (b4_args, _b4_args): Move to...
	* data/bison.m4: here.
2008-11-25 22:07:23 +01:00
Joel E. Denny
6c88b51e83 Fix unexpanded macros in GLR defines file.
Reported by Csaba Raduly at
<http://lists.gnu.org/archive/html/bug-bison/2008-11/msg00048.html>.
* THANKS (Csaba Raduly): Add.
* data/glr.c: Fix overquoting on b4_prefix for yylval and yylloc.
* tests/calc.at (_AT_DATA_CALC_Y): If %defines is specified, generate
lexer in a separate module that includes the defines file.
(AT_CHECK_CALC): From AT_FULL_COMPILE, request compilation of lexer
source.
* tests/local.at (_AT_BISON_OPTION_PUSHDEFS): Push AT_DEFINES_IF.
Adjust AT_LOC and AT_VAL to use AT_NAME_PREFIX.
(AT_BISON_OPTION_POPDEFS): Pop AT_DEFINES_IF.
(AT_DATA_SOURCE_PROLOGUE): New.
(AT_DATA_GRAMMAR_PROLOGUE): Use AT_DATA_SOURCE_PROLOGUE.
(AT_DATA_SOURCE): New.
(AT_FULL_COMPILE): Extend to support an additional source file.
2008-11-19 00:24:34 -05:00
Akim Demaille
bd187d7b65 Use b4_subtract where possible.
* data/lalr1.cc (b4_subtract): Move to...
	* data/bison.m4: here.
	* data/glr.c (b4_rhs_data): Use it.
	* data/yacc.c (b4_rhs_value, b4_rhs_location): Use it.
2008-11-18 20:57:26 +01:00
Akim Demaille
6085ab0d78 Remove incorrect mode specification.
* data/glr.cc: Don't pretend it's C code.
2008-11-18 20:53:21 +01:00
Akim Demaille
c5fc95d688 Update ignores.
* data/.cvsignore, data/.gitignore, examples/.cvsignore,
	* examples/.gitignore:
	Remove.
	* build-aux/.cvsignore, build-aux/.gitignore, doc/.cvsignore,
	* doc/.gitignore, etc/.cvsignore, etc/.gitignore, src/.cvsignore,
	* src/.gitignore, tests/.cvsignore, tests/.gitignore:
	Remove Makefile and Makefile.in.
2008-11-16 19:46:16 +01:00
Akim Demaille
cb823b6f0c Support parametric types.
There are two issues to handle: first scanning nested angle bracket pairs
to support types such as std::pair< std::string, std::list<std::string> >.

Another issue is to address idiosyncrasies of C++: do not glue two closing
angle brackets together (otherwise it's operator>>), and avoid sticking
blindly a TYPE to the opening <, as it can result in '<:' which is a
digraph for '['.

	* src/scan-gram.l (brace_level): Rename as...
	(nesting): this.
	(SC_TAG): New.
	Implement support for complex tags.
	(tag): Accept \n, but not <.
	* data/lalr1.cc (b4_symbol_value, b4_symbol_value_template)
	(b4_symbol_variant): Leave space around types as parameters.
	* examples/variant.yy: Use nested template types and leading ::.
	* src/parse-gram.y (TYPE, TYPE_TAG_ANY, TYPE_TAG_NONE, type.opt):
	Rename as...
	(TAG, TAG_ANY, TAG_NONE, tag.opt): these.
	* tests/c++.at: Test parametric types.
2008-11-15 14:30:05 +01:00
Akim Demaille
7d3e21ba7b Formatting change.
2008-11-15 14:20:28 +01:00
Akim Demaille
b0d79ec65d Comment changes.
* data/local.mk, etc/local.mk, examples/local.mk: Use Automake
	comments for the license.
2008-11-15 11:23:53 +01:00
Akim Demaille
6a5aa0cdbb Remove data/Makefile.am.
* data/Makefile.am: Rename as...
	* data/local.mk: this.
	Adjust paths.
	* Makefile.am, configure.ac: Adjust.
2008-11-15 10:39:38 +01:00
Akim Demaille
0634493cdd Provide convenience constructors for locations and positions.
* data/location.cc (position::position): Accept file, line and
	column as arguments with default values.
	Always qualify initial line and column literals as unsigned.
	(location::location): Provide convenience constructors.
2008-11-15 10:23:51 +01:00
Akim Demaille
fe1b448ada Instead of using make_symbol<TOK_FOO>, generate make_FOO for each token type.
Using a template buys us nothing, and makes it needlessly complex to
construct a symbol.  Besides, it could not be generalized to other
languages, while make_FOO would work in C, Java, etc.

	* data/lalr1.cc (b4_symbol_): New.
	(b4_symbol): Use it.
	(b4_symbol_constructor_declaration_)
	(b4_symbol_constructor_definition_): Instead of generating
	specializations of an overloaded template function, just generate
	several functions whose names are forged from the token names
	without the token.prefix.
	(b4_symbol_constructor_declarations): Generate them for all the
	symbols, not just by class of symbol type, now that instead of
	specializing a function template by the token, we generate a
	function named after the token.
	(b4_symbol_constructor_specialization_)
	(b4_symbol_constructor_specializations): Remove.
	* etc/bench.pl.in: Adjust to this new API.
2008-11-15 10:20:02 +01:00
Akim Demaille
5679f31101 %define token.prefix.
Provide a means to add a prefix to the names of the tokens as output in the
generated files.  Because of name clashes, it is good to have a prefix
such as TOK_ that protects against names such as EOF, FILE, etc.
But such a prefix clutters the grammar itself.

	* data/bison.m4 (token.prefix): Empty by default.
	* data/c.m4 (b4_token_enum, b4_token_define): Use it.
	* data/lalr1.cc (b4_symbol): Ditto.
2008-11-13 07:08:24 +01:00
Akim Demaille
3204049e31 Compute at M4 time some of the subtractions.
* data/lalr1.cc (b4_substract): New.
	(b4_rhs_data): Use it.
2008-11-13 07:04:47 +01:00
Akim Demaille
202598d3ab symbol::token.
This allows the user to get the type of a token returned by
yylex.

	* data/lalr1.cc (symbol::token): New.
	(yytoknum_): Define when %define lex_symbol, independently of
	%debug.
	(yytoken_number_): Move into...
	(symbol::token): here, since that's the only use.
	The other one is YYPRINT which was not officially supported
	by lalr1.cc, and anyway it did not work since YYPRINT uses this
	array under a different name (yytoknum).
2008-11-13 07:01:41 +01:00
Akim Demaille
cb0b136a63 Comment changes.
* data/lalr1.cc, data/yacc.c: Fix the description of the
	yytranslate and yytoknum tables.
2008-11-13 06:52:05 +01:00
Akim Demaille
2c086d2959 Define make_symbol in the header.
To reach good performance these functions should be inlined (yet this
remains to be measured precisely).  To this end they must be available to
the caller.

	* data/lalr1.cc (b4_symbol_constructor_definition_): Qualify
	location_type with the class name.
	Since it will now be output in the header, declare "inline".
	No longer use b4_symbol_constructor_specializations, but
	b4_symbol_constructor_definitions in the header.
	Don't call it in the *.cc file.
2008-11-13 06:48:22 +01:00
Akim Demaille
1c4af3813e Define yytranslate in the header for lex_symbol.
* data/lalr1.cc: Move the invocation of b4_yytranslate_definition
	into the header file when using %define lex_symbol.
	(yytranslate_): Declare inline.
2008-11-13 06:44:50 +01:00
Akim Demaille
e51b0a82be Define the constructors of symbol_type in b4_symbol_constructor_definitions.
The constructors are called by the make_symbol functions, which a
forthcoming patch will move elsewhere.  Hence the interest of putting them
together.

The stack_symbol_type does not need to be moved, it is used only by the
parser.

	* data/lalr1.cc: Move symbol_type and symbol_base_type
	constructors into...
	(b4_symbol_constructor_definitions): here.
	Adjust.
2008-11-13 06:41:42 +01:00
Akim Demaille
788355718f Make it easier to move the definition of yytranslate_.
Forthcoming changes will make it possible to use yytranslate_
from outside the parser implementation file.

	* data/lalr1.cc (b4_yytranslate_definition): New.
	Use it.
2008-11-13 06:36:51 +01:00
Akim Demaille
c1e6c88ca3 Remove useless class specification.
* data/lalr1.cc (b4_symbol_constructor_specialization_): No need
	to refer to the class name to use a type defined by the class for
	arguments of member functions.
2008-11-13 06:33:51 +01:00
Akim Demaille
4654b0c0a8 Finer input type for yytranslate.
This patch is debatable: tradition expects yylex to return an int
which happens to correspond to token_number (which is an enum).  This
allows, for instance, returning characters (such as '*' etc.).  But this
goes against the stronger typing I am trying to have with the new
lex interface, which returns a symbol_type.  So in this case, feed
yytranslate_ with a token_type.

	* data/lalr1.cc (yytranslate_): When in %define lex-symbol,
	expect a token_type.
2008-11-13 06:30:35 +01:00
Akim Demaille
dd735e4ee6 Honor lex-params in %define lex_symbol mode.
* data/lalr1.cc: Use b4_lex_param.
2008-11-13 06:27:15 +01:00
Akim Demaille
6659366cda Simplify names.
* src/output.c (symbol_definitions_output): Rename symbol
	attributes type_name and has_type_name as type and has_type.
	* data/lalr1.cc: Adjust uses.
2008-11-13 06:24:01 +01:00
Akim Demaille
e9805e5743 Use b4_type_names for the union type.
The union used to compute the size of the variant used to iterate over the
type of all the symbols, with a lot of redundancy.  Now iterate over the
lists of symbols having the same type-name.

	* data/lalr1.cc (b4_char_sizeof_): New.
	(b4_char_sizeof): Use it.
	Adjust to be called with a list of numbers instead of a single
	number.
	Adjust its caller for new-line issues.
2008-11-13 06:20:59 +01:00
Akim Demaille
aea10ef46f Define the "identifier" of a symbol.
Symbols may have several string representations, for instance if they
have an alias.  What I call its "id" is a string that can be used as
an identifier.  It may not exist.

Currently the symbols which have the "tag_is_id" flag set are those that
don't have an alias.  Look harder for the id.

	* src/output.c (is_identifier): Move to...
	* src/symtab.c (is_identifier): here.
	* src/symtab.h, src/symtab.c (symbol_id_get): New.
	* src/output.c (symbol_definitions_output): Use it to define "id"
	and "has_id".
	Remove the definition of "tag_is_id".
	* data/lalr1.cc: Use the "id" and "has_id" wherever "tag" and
	"tag_is_id" were used to produce code.
	We still use "tag" for documentation.
2008-11-13 06:17:09 +01:00
Akim Demaille
2ea7730c56 Locations are no longer required by lalr1.cc.
* data/lalr1.cc (_b4_args, b4_args): New.
	Adjust all uses of locations to make them optional.
	* tests/c++.at (AT_CHECK_VARIANTS): No longer use the locations.
	(AT_CHECK_NAMESPACE): Check the use of locations.
	* tests/calc.at (_AT_DATA_CALC_Y): Adjust to be usable with or
	without locations with lalr1.cc.
	Test these cases.
	* tests/output.at: Check lalr1.cc with and without location
	support.
	* tests/regression.at (_AT_DATA_EXPECT2_Y, _AT_DATA_DANCER_Y):
	Don't use locations.
2008-11-11 16:38:10 +01:00
Akim Demaille
c944f7f22d Simplify lalr1.cc since %defines is mandatory.
* data/lalr1.cc: Remove useless calls to b4_defines_if.
2008-11-11 16:05:29 +01:00
Akim Demaille
422c18f48d Prefer M4 to CPP.
* data/lalr1.cc: Use b4_error_verbose_if instead of #if
	YYERROR_VERBOSE.
2008-11-11 15:59:05 +01:00
Akim Demaille
a0ffc1751e Support i18n of the parse error messages.
* TODO (lalr1.cc/I18n): Remove.
	* data/lalr1.cc (yysyntax_error_): Support the translation of the
	error messages, as done in yacc.c.
	Stay within the yy* pseudo namespace.
2008-11-11 15:55:54 +01:00
Akim Demaille
0927787504 Make it possible to return a symbol_type from yylex.
* data/lalr1.cc (b4_lex_symbol_if): New.
	(parse): When lex_symbol is defined, expect yylex to return the
	complete lookahead.
	* etc/bench.pl.in (generate_grammar_list): Extend to support this
	yylex interface.
	(bench_variant_parser): Exercise it.
2008-11-11 15:48:52 +01:00
Akim Demaille
39be90223b Replace yychar with a Boolean.
* data/lalr1.cc (parse::yychar): Replace by...
	(parse::yyempty): this.
2008-11-11 15:36:23 +01:00
Akim Demaille
aba12ad162 Let yytranslate handle the eof case.
* data/lalr1.cc (yytranslate_): Handle the EOF case.
	Adjust callers.
	No longer expect yychar to be equal to yyeof_, rather, test the
	lookahead's (translated) kind.
2008-11-11 15:29:39 +01:00
Akim Demaille
27cb5b5901 yychar cannot be empty in yyerrlab.
* TODO (yychar == yyempty_): New.
	* data/lalr1.cc: Remove the handling of this case.
	This eases forthcoming changes related to yychar and yytranslate.
2008-11-11 15:26:17 +01:00
Akim Demaille
2873fdf8b1 Introduce make_symbol.
make_symbol provides a means to construct a full symbol (kind, value,
location) in a single shot.  It is meant to be a Symbol constructor,
parameterized by the symbol kind so that overloading would prevent
incorrect kind/value pairs.  Unfortunately parameterized constructors do
not work well in C++ (unless the parameter also appears as an argument,
which is not acceptable), hence the use of a function instead of a
constructor.

	* data/lalr1.cc (b4_symbol_constructor_declaration_)
	(b4_symbol_constructor_declarations)
	(b4_symbol_constructor_specialization_)
	(b4_symbol_constructor_specializations)
	(b4_symbol_constructor_definition_)
	(b4_symbol_constructor_definitions): New.
	Use them where appropriate to generate declaration, declaration of
	the specializations, and implementations of the templated
	overloaded function "make_symbol".
	(variant::variant): Always define a default ctor.
	Also provide a copy ctor.
	(symbol_base_type, symbol_type): New ctor overloads for value-less
	symbols.
	(symbol_type): Now public, so that functions such as yylex can use
	it.
2008-11-11 15:16:53 +01:00
Akim Demaille
11707b2b48 Get rid of tabulations in the Java output.
Test 214 was failing: it greps with a pattern containing [    ]* which
was obviously meant to catch spaces and tabs, but contained only tabs.
Tabulations in sources are a nuisance, so to simplify the matter, get rid
of all the tabulations in the Java sources.  The other skeletons will be
treated equally later.

	* data/java.m4, data/lalr1.java: Untabify.
	* tests/java.at: Simplify AT_CHECK_JAVA_GREP invocations:
	tabulations are no longer generated.
2008-11-11 14:42:35 +01:00
Di-an Jan
09ccae9b18 Work around Java's ``code too large'' problem for parser tables.
* data/java.m4 (b4_typed_parser_table, b4_integral_parser_table): New.
* data/lalr1.java (yypact_, yydefact_, yypgoto_, yydefgoto_,
yytable_, yycheck_, yystos_, yytoken_number_, yyr1_, yyr2_, yyrhs_,
yyprhs_, yyrline_, yytranslate_table_): Use b4_integral_parser_table.
(yytname_): Use b4_typed_parser_table.
* doc/bison.texinfo (Java Bison Interface): Add note on Java's
``code too large'' error.
2008-11-10 14:34:53 +01:00
Di-an Jan
1979121c96 Various Java skeleton improvements.
* NEWS: Document them.

General Java skeleton improvements.
* configure.ac (gt_JAVACOMP): Request target of 1.4, which allows
using gcj < 4.3 in the testsuite, according to comments in
gnulib/m4/javacomp.m4.
* data/java.m4 (stype, parser_class_name, lex_throws, throws,
location_type, position_type): Remove extraneous brackets from
b4_percent_define_default.
(b4_lex_param, b4_parse_param): Remove extraneous brackets from
m4_define and m4_define_default.
* data/lalr1.java (b4_pre_prologue): Change to b4_user_post_prologue,
which marks the end of user code with appropriate syncline, like all
the other skeletons.
(b4_user_post_prologue): Add.  Don't silently drop.
(yylex): Remove.
(parse): Inline yylex.
* doc/bison.texinfo (bisonVersion, bisonSkeleton): Document.
(%{...%}): Fix typo of %code imports.
* tests/java.at (AT_JAVA_COMPILE): Add "java" keyword.

Support annotations on parser class with %define annotations.
* data/lalr1.java (annotations): Add to parser class modifier.
* doc/bison.texinfo (Java Parser Interface): Document
%define annotations.
(Java Declarations Summary): Document %define annotations.
* tests/java.at (Java parser class modifiers): Test annotations.

Do not generate code for %error-verbose unless requested.
* data/lalr1.java (errorVerbose): Rename to yyErrorVerbose.
Make private.  Make conditional on %error-verbose.
(getErrorVerbose, setErrorVerbose): New.
(yytnamerr_): Make conditional on %error-verbose.
(yysyntax_error): Make some code conditional on %error-verbose.
* doc/bison.texinfo (Java Bison Interface): Remove the parts
about %error-verbose having no effect.
(getErrorVerbose, setErrorVerbose): Document.

Move constants for token names to Lexer interface.
* data/lalr1.java (Lexer): Move EOF, b4_token_enums(b4_tokens) here.
* data/java.m4 (b4_token_enum): Indent for move to Lexer interface.
(parse): Qualify EOF to Lexer.EOF.
* doc/bison.texinfo (Java Parser Interface): Move documentation of
EOF and token names to Java Lexer Interface.
* tests/java.at (_AT_DATA_JAVA_CALC_Y): Remove Calc qualifier.

Make yyerror public.
* data/lalr1.java (Lexer.yyerror): Use longer parameter name.
(yyerror): Change to public.  Add Javadoc comments.  Use longer
parameter names.  Make the body rather than the declarator
conditional on %locations.
* doc/bison.texinfo (yyerror): Document.  Don't mark as protected.

Allow user to add code to the constructor with %code init.
* data/java.m4 (b4_init_throws): New, for %define init_throws.
* data/lalr1.java (YYParser.YYParser): Add b4_init_throws.
Add %code init to the front of the constructor body.
* doc/bison.texinfo (YYParser.YYParser): Document %code init
and %define init_throws.
(Java Declarations Summary): Document %code init and
%define init_throws.
* tests/java.at (Java %parse-param and %lex-param): Adjust grep.
(Java constructor init and init_throws): Add tests.
2008-11-10 14:34:52 +01:00
Akim Demaille
247efe346c Formatting changes.
2008-11-10 12:01:19 +01:00
Akim Demaille
5d73144067 More information about the symbols.
* src/output.c (type_names_output): Document all the symbols,
	including those that don't have a type-name.
	(symbol_definitions_output): Define "is_token" and
	"has_type_name".
	* data/lalr1.cc (b4_type_action_): Skip symbols that have an empty
	type-name, now that they are defined too in b4_type_names.
2008-11-10 11:58:01 +01:00
Akim Demaille
6ed15cde29 Make parser::yytranslate static.
Small speedup (1%) on the list grammar.  It also makes yytranslate_
available in non-member functions.

	* data/lalr1.cc (yytranslate_): Does not need to be an instance
	function.
2008-11-10 11:50:57 +01:00
Akim Demaille
30bb2edccf Avoid trailing spaces.
* data/c.m4: b4_comment(TEXT): Don't indent empty lines.
	* data/lalr1.cc: Don't indent before rule and symbol actions, as
	they can be empty, and anyway this incorrectly indents the first
	action.
2008-11-10 11:47:49 +01:00
Akim Demaille
914202bdac Use "enum" for integral constants.
This is just nicer to read; I observed no speedup.

	* data/lalr1.cc (yyeof_, yylast_, yynnts_, yyempty_, yyfinal_)
	(yyterror_, yyerrcode_, yyntokens_): Define as members of an enum.
	(yyuser_token_number_max_, yyundef_token_): Move into...
	(yytranslate_): here.
2008-11-10 11:41:00 +01:00
Akim Demaille
b9855ea55b Formatting changes.
* data/lalr1.cc: here.
2008-11-10 11:32:12 +01:00
Akim Demaille
4c3cc7da5d Classify symbols by type-name.
* src/uniqstr.h (UNIQSTR_CMP): New.
	* src/output.c (symbol_type_name_cmp, symbols_by_type_name)
	(type_names_output): New.
	(muscles_output): Use it.
	* data/lalr1.cc (b4_symbol_action_): Remove.
	(b4_symbol_case_, b4_type_action_): New.
	Adjust uses of b4_symbol_action_ to use b4_type_action_.
2008-11-10 11:25:36 +01:00
Akim Demaille
d69c9694a7 Change the handling of the symbols in the skeletons.
Before, we were using tables whose lines were the symbols and whose
columns were things like number, tag, type-name etc.  It was
difficult to extend: each time a column was added, all the numbers had
to be updated (you asked for column $2, not for "tag").  Also, it was
hard to filter these tables when only a subset of the symbols (say the
tokens, or the nterms, or the tokens that have an external number
*and* a type-name) was of interest.

Now instead of monolithic tables, we define one macro per cell.  For
instance "b4_symbol(0, tag)" is a macro name whose content is
self-descriptive.  The macro "b4_symbol" provides easier access to
these cells.

	* src/output.c (type_names_output): Remove.
	(symbol_numbers_output, symbol_definitions_output): New.
	(muscles_output): Call them.
	(prepare_symbols): Define b4_symbols_number.
2008-11-10 11:21:50 +01:00