Commit Graph

603 Commits

Author SHA1 Message Date
Akim Demaille
2873fdf8b1 Introduce make_symbol.
make_symbol provides a means to construct a full symbol (kind, value,
location) in a single shot.  It is meant to be a Symbol constructor,
parameterized by the symbol kind so that overloading would prevent
incorrect kind/value pairs.  Unfortunately parameterized constructors do
not work well in C++ (unless the parameter also appears as an argument,
which is not acceptable), hence the use of a function instead of a
constructor.

	* data/lalr1.cc (b4_symbol_constructor_declaration_)
	(b4_symbol_constructor_declarations)
	(b4_symbol_constructor_specialization_)
	(b4_symbol_constructor_specializations)
	(b4_symbol_constructor_definition_)
	(b4_symbol_constructor_definitions): New.
	Use them where appropriate to generate declaration, declaration of
	the specializations, and implementations of the templated
	overloaded function "make_symbol".
	(variant::variant): Always define a default ctor.
	Also provide a copy ctor.
	(symbol_base_type, symbol_type): New ctor overloads for value-less
	symbols.
	(symbol_type): Now public, so that functions such as yylex can use
	it.
2008-11-11 15:16:53 +01:00
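The idea described in this message can be sketched as follows. This is a minimal illustration written in modern C++, not the code that lalr1.cc generates: the names `token_kind`, `symbol`, `END`, `NUMBER` and `TEXT` are hypothetical, the location is left out, and the use of deleted primary templates is an assumption made here to turn a wrong kind/value pair into a compile-time error.

```cpp
#include <string>

// Hypothetical symbol kinds; a real parser derives them from the grammar.
enum token_kind { END, NUMBER, TEXT };

// Toy symbol: a kind and its value (the location is left out).
struct symbol
{
  token_kind kind;
  int ival;
  std::string sval;
};

// The primary templates are deleted, so only the kind/value pairs
// that are explicitly specialized below can be called.
template <token_kind K> symbol make_symbol () = delete;
template <token_kind K> symbol make_symbol (int v) = delete;
template <token_kind K> symbol make_symbol (const std::string& v) = delete;

// END carries no value.
template <> inline symbol make_symbol<END> ()
{
  return symbol{END, 0, std::string ()};
}

// NUMBER carries an int.
template <> inline symbol make_symbol<NUMBER> (int v)
{
  return symbol{NUMBER, v, std::string ()};
}

// TEXT carries a string.
template <> inline symbol make_symbol<TEXT> (const std::string& v)
{
  return symbol{TEXT, 0, v};
}
```

With these declarations, a call such as `make_symbol<NUMBER> (std::string ("x"))` selects a deleted function and is rejected by the compiler, while `make_symbol<NUMBER> (42)` builds the symbol in one call.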
Akim Demaille
11707b2b48 Get rid of tabulations in the Java output.
Test 214 was failing: it greps with a pattern containing [    ]* which
was obviously meant to catch spaces and tabs, but contained only tabs.
Tabulations in sources are a nuisance, so to simplify the matter, get rid
of all the tabulations in the Java sources.  The other skeletons will be
treated equally later.

	* data/java.m4, data/lalr1.java: Untabify.
	* tests/java.at: Simplify AT_CHECK_JAVA_GREP invocations:
	tabulations are no longer generated.
2008-11-11 14:42:35 +01:00
Di-an Jan
09ccae9b18 Work around Java's ``code too large'' problem for parser tables.
* data/java.m4 (b4_typed_parser_table, b4_integral_parser_table): New.
* data/lalr1.java (yypact_, yydefact_, yypgoto_, yydefgoto_,
yytable_, yycheck_, yystos_, yytoken_number_, yyr1_, yyr2_, yyrhs_,
yyprhs_, yyrline_, yytranslate_table_): Use b4_integral_parser_table.
(yytname_): Use b4_typed_parser_table.
* doc/bison.texinfo (Java Bison Interface): Add note on Java's
``code too large'' error.
2008-11-10 14:34:53 +01:00
Di-an Jan
1979121c96 Various Java skeleton improvements.
* NEWS: Document them.

General Java skeleton improvements.
* configure.ac (gt_JAVACOMP): Request target of 1.4, which allows
using gcj < 4.3 in the testsuite, according to comments in
gnulib/m4/javacomp.m4.
* data/java.m4 (stype, parser_class_name, lex_throws, throws,
location_type, position_type): Remove extraneous brackets from
b4_percent_define_default.
(b4_lex_param, b4_parse_param): Remove extraneous brackets from
m4_define and m4_define_default.
* data/lalr1.java (b4_pre_prologue): Change to b4_user_post_prologue,
which marks the end of user code with appropriate syncline, like all
the other skeletons.
(b4_user_post_prologue): Add.  Don't silently drop.
(yylex): Remove.
(parse): Inline yylex.
* doc/bison.texinfo (bisonVersion, bisonSkeleton): Document.
(%{...%}): Fix typo of %code imports.
* tests/java.at (AT_JAVA_COMPILE): Add "java" keyword.
Support annotations on parser class with %define annotations.
* data/lalr1.java (annotations): Add to parser class modifier.
* doc/bison.texinfo (Java Parser Interface): Document
%define annotations.
(Java Declarations Summary): Document %define annotations.
* tests/java.at (Java parser class modifiers): Test annotations.
Do not generate code for %error-verbose unless requested.
* data/lalr1.java (errorVerbose): Rename to yyErrorVerbose.
Make private.  Make conditional on %error-verbose.
(getErrorVerbose, setErrorVerbose): New.
(yytnamerr_): Make conditional on %error-verbose.
(yysyntax_error): Make some code conditional on %error-verbose.
* doc/bison.texinfo (Java Bison Interface): Remove the parts
about %error-verbose having no effect.
(getErrorVerbose, setErrorVerbose): Document.
Move constants for token names to Lexer interface.
* data/lalr1.java (Lexer): Move EOF, b4_token_enums(b4_tokens) here.
* data/java.m4 (b4_token_enum): Indent for move to Lexer interface.
(parse): Qualify EOF to Lexer.EOF.
* doc/bison.texinfo (Java Parser Interface): Move documentation of
EOF and token names to Java Lexer Interface.
* tests/java.at (_AT_DATA_JAVA_CALC_Y): Remove Calc qualifier.
Make yyerror public.
* data/lalr1.java (Lexer.yyerror): Use longer parameter name.
(yyerror): Change to public.  Add Javadoc comments.  Use longer
parameter names.  Make the body rather than the declarator
conditional on %locations.
* doc/bison.texinfo (yyerror): Document.  Don't mark as protected.
Allow user to add code to the constructor with %code init.
* data/java.m4 (b4_init_throws): New, for %define init_throws.
* data/lalr1.java (YYParser.YYParser): Add b4_init_throws.
Add %code init to the front of the constructor body.
* doc/bison.texinfo (YYParser.YYParser): Document %code init
and %define init_throws.
(Java Declarations Summary): Document %code init and
%define init_throws.
* tests/java.at (Java %parse-param and %lex-param): Adjust grep.
(Java constructor init and init_throws): Add tests.
2008-11-10 14:34:52 +01:00
Akim Demaille
247efe346c Formatting changes.
2008-11-10 12:01:19 +01:00
Akim Demaille
5d73144067 More information about the symbols.
* src/output.c (type_names_output): Document all the symbols,
	including those that don't have a type-name.
	(symbol_definitions_output): Define "is_token" and
	"has_type_name".
	* data/lalr1.cc (b4_type_action_): Skip symbols that have an empty
	type-name, now that they are defined too in b4_type_names.
2008-11-10 11:58:01 +01:00
Akim Demaille
6ed15cde29 Make parser::yytranslate static.
Small speedup (1%) on the list grammar.  It also makes yytranslate_
available in non-member functions.

	* data/lalr1.cc (yytranslate_): Does not need to be an instance
	function.
2008-11-10 11:50:57 +01:00
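A small sketch of why a static member helps here. The class name `parser_sketch`, the helper `symbol_number_of`, and the table contents are all made up for illustration; the real translation table and its bounds come from the grammar.

```cpp
class parser_sketch
{
public:
  // Maps an external token number to an internal symbol number.
  // It reads only constant data, so it needs no `this` pointer.
  static int yytranslate_ (int t)
  {
    // Made-up table: external numbers 0..3 map to these symbols.
    static const unsigned char translate_table[] = { 0, 2, 2, 3 };
    const int user_token_number_max = 3;  // hypothetical bound
    const int undef_token = 2;            // hypothetical "undefined" symbol
    if (t < 0 || user_token_number_max < t)
      return undef_token;
    return translate_table[t];
  }
};

// A non-member function can call it without any parser object.
inline int symbol_number_of (int external_token)
{
  return parser_sketch::yytranslate_ (external_token);
}
```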
Akim Demaille
30bb2edccf Avoid trailing spaces.
* data/c.m4: b4_comment(TEXT): Don't indent empty lines.
	* data/lalr1.cc: Don't indent before rule and symbol actions, as
	they can be empty, and anyway this incorrectly indents the first
	action.
2008-11-10 11:47:49 +01:00
Akim Demaille
914202bdac Use "enum" for integral constants.
This is just nicer to read; I observed no speedup.

	* data/lalr1.cc (yyeof_, yylast_, yynnts_, yyempty_, yyfinal_)
	(yyterror_, yyerrcode_, yyntokens_): Define as members of an enum.
	(yyuser_token_number_max_, yyundef_token_): Move into...
	(yytranslate_): here.
2008-11-10 11:41:00 +01:00
Akim Demaille
b9855ea55b Formatting changes.
* data/lalr1.cc: here.
2008-11-10 11:32:12 +01:00
Akim Demaille
4c3cc7da5d Classify symbols by type-name.
* src/uniqstr.h (UNIQSTR_CMP): New.
	* src/output.c (symbol_type_name_cmp, symbols_by_type_name)
	(type_names_output): New.
	(muscles_output): Use it.
	* data/lalr1.cc (b4_symbol_action_): Remove.
	(b4_symbol_case_, b4_type_action_): New.
	Adjust uses of b4_symbol_action_ to use b4_type_action_.
2008-11-10 11:25:36 +01:00
Akim Demaille
d69c9694a7 Change the handling of the symbols in the skeletons.
Before, we were using tables whose lines were the symbols and whose
columns were things like number, tag, type-name, etc.  It was
difficult to extend: each time a column was added, all the numbers had
to be updated (you asked for column $2, not for "tag").  Also, it was
hard to filter these tables when only a subset of the symbols (say the
tokens, or the nterms, or the tokens that have an external number
*and* a type-name) was of interest.

Now, instead of monolithic tables, we define one macro per cell.  For
instance "b4_symbol(0, tag)" is a macro name whose contents are
self-descriptive.  The macro "b4_symbol" provides easier access to
these cells.

	* src/output.c (type_names_output): Remove.
	(symbol_numbers_output, symbol_definitions_output): New.
	(muscles_output): Call them.
	(prepare_symbols): Define b4_symbols_number.
2008-11-10 11:21:50 +01:00
Akim Demaille
e5eb92e794 Support constructor with an argument.
This improves the "list" bench by 2%.

	* data/lalr1.cc (variant::build): Add an overloaded version with
	an argument.
	* tests/c++.at (AT_CHECK_VARIANT): Check it.
2008-11-10 11:04:31 +01:00
Akim Demaille
5de9c59301 Use a static hierarchy for symbols in the C++ parser.
* data/lalr1.cc (symbol_base_type, symbol_type)
	(stack_symbol_type): Make it a static hierarchy.
	Adjust dependencies.
2008-11-09 19:57:30 +01:00
Akim Demaille
d3be4f6d42 Use inline for small operations.
* data/lalr1.cc (symbol_base_type, symbol_type)
	(stack_symbol_type): Declare constructor and other operations as
	inline.
	(yy_destroy_): Inline.
2008-11-09 19:51:28 +01:00
Akim Demaille
1f7d007bf6 Introduce a hierarchy for symbols.
* data/lalr1.cc (symbol_base_type, symbol_type): New.
	(data_type): Rename as...
	(stack_symbol_type): this.
	Derive from symbol_base_type.
	(yy_symbol_value_print_): Merge into...
	(yy_symbol_print_): this.
	Rename as...
	(yy_print_): this.
	(yydestruct_): Rename as...
	(yy_destroy_): this.
	(b4_symbols_actions, YY_SYMBOL_PRINT): Adjust.
	(parser::parse): yyla is now of symbol_type.
	Use its type member instead of yytoken.
2008-11-09 19:48:20 +01:00
Akim Demaille
bc0b0477e2 Rename data_type and stack_symbol_type.
* data/lalr1.cc (data_type): Rename as...
	(stack_symbol_type): this.
2008-11-09 19:45:14 +01:00
Akim Demaille
57295d14f9 Handle semantic value and location together.
* data/lalr1.cc (b4_symbol_actions): Bounce $$ and @$ to
	yydata.value and yydata.location.
	(yy_symbol_value_print_, yy_symbol_print_, yydestruct_)
	(YY_SYMBOL_PRINT): Now take semantic value and location as a
	single arg.
	Adjust all callers.
	(yydestruct_): New overload for a stack symbol.
2008-11-09 19:42:08 +01:00
Akim Demaille
e9b0834e18 Push a complete symbol, not connected parts.
* data/lalr1.cc (yypush_): Take a data_type&, not disconnected
	state, value and location.
	Adjust callers.
2008-11-09 19:39:09 +01:00
Akim Demaille
6082531abb Aggregate yylval and yylloc.
* data/lalr1.cc (parser::yylval, parser::yylloc): Replace by...
	(parser::yyla): this.
2008-11-09 19:36:04 +01:00
Akim Demaille
33c195cc37 Rely on the state stack to display reduction traces.
To display rhs symbols before a reduction, we used information about the rule
reduced, which required the tables yyrhs and yyprhs.  Now rely only on the
state stack to get the same information.

	* data/lalr1.cc (b4_rhs_data, b4_rhs_state): New.
	Use them.
	(parser::yyrhs_, parser::yyprhs_): Remove.
	(parser::yy_reduce_print_): Use the state stack.
2008-11-09 19:33:04 +01:00
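One way to read the approach above: each parser state is entered on exactly one symbol, so the symbols of the right-hand side can be recovered from the topmost states of the stack. The sketch below assumes a hypothetical `accessing_symbol` table indexed by state number and a known right-hand side length that does not exceed the stack height; the function name and the data layout are illustrative and are not taken from lalr1.cc.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the names of the rhs symbols of the rule about to be reduced,
// using only the states on the stack (bottom first, top last).
// Precondition: rhs_length <= states.size ().
std::vector<std::string>
rhs_from_stack (const std::vector<int>& states,
                const std::vector<int>& accessing_symbol,
                const std::vector<std::string>& symbol_name,
                std::size_t rhs_length)
{
  std::vector<std::string> res;
  // The rhs occupies the topmost rhs_length entries of the stack.
  for (std::size_t i = states.size () - rhs_length; i < states.size (); ++i)
    res.push_back (symbol_name[accessing_symbol[states[i]]]);
  return res;
}
```

No per-rule table of right-hand side symbols is consulted: only the length of the rule and the per-state symbol are needed.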
Akim Demaille
e1f93869da Fuse yyval and yyloc into yylhs.
* data/lalr1.cc (b4_lhs_value, b4_lhs_location): Adjust to using
	yylhs.
	(parse): Replace yyval and yyloc with yylhs.value and
	yylhs.location.
	After a user action, compute yylhs.state earlier.
	(yyerrlab1): Do not play tricks with yylhs.location, rather, use a
	fresh error_token.
2008-11-09 19:29:38 +01:00
Akim Demaille
9380cfd008 Moving push traces into yypush_.
* data/lalr1.cc (yypush_): Now takes an optional trace message.
	Adjust all uses.
2008-11-07 21:38:54 +01:00
Akim Demaille
8901f32e4a The single-stack C++ parser is now the standard one.
* data/lalr1.cc: Rename as...
	* data/lalr1-split.cc: this.
	* data/lalr1-fusion.cc: Rename as...
	* data/lalr1.cc: this.
	* etc/bench.pl.in: Adjust.
2008-11-07 21:38:45 +01:00
Akim Demaille
8cdabf02ea Avoid empty-if warnings.
Reported by Quentin Hocquet.

	* data/lalr1-fusion.cc (YY_SYMBOL_PRINT, YY_REDUCE_PRINT)
	(YY_STACK_PRINT): Provide some contents even when !YYDEBUG.
2008-11-07 21:38:40 +01:00
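The kind of warning addressed here can be sketched as follows. The macro name `TRACE_SKETCH` and the function `step` are hypothetical, and the exact expansion used by the skeleton may differ; the point is only that a debug macro which expands to nothing leaves an `if` with an empty body, whereas a no-op expression does not.

```cpp
#include <iostream>

#ifndef YYDEBUG
# define YYDEBUG 0
#endif

#if YYDEBUG
// Debug build: really print the message.
# define TRACE_SKETCH(Msg) \
  do { std::cerr << (Msg) << '\n'; } while (false)
#else
// Non-debug build: expand to a harmless expression rather than to
// nothing, so that `if (cond) TRACE_SKETCH (...);` keeps a body.
# define TRACE_SKETCH(Msg) static_cast<void> (0)
#endif

// Example caller with a conditional trace.
inline int step (int state, bool verbose)
{
  if (verbose)
    TRACE_SKETCH ("entering next state");
  return state + 1;
}
```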
Akim Demaille
a3d4c6fbb1 Destroy the variants that remain on the stack in case of error.
* data/lalr1-fusion.cc (yydestruct_): Invoke the variant's
	destructor.
	Display the value only if yymsg is nonnull.
	(yyreduce): Invoke yydestruct_ when popping lhs symbols.
2008-11-07 21:38:10 +01:00
Akim Demaille
2d32fc9fe2 Add "%define assert" to variants.
This is used to help the user catch cases where some value gets
overwritten by a new one.  This should not happen, as this will
probably leak.

Unfortunately this uncovered a bug in the C++ parser itself: the
lookahead value was not destroyed between two calls to yylex.  For
instance if the previous lookahead was a std::string, and then an int,
then the value of the std::string was correctly taken (i.e., the
lookahead was now an empty string), but std::string structure itself
was not reclaimed.

This is now done in variant::build(other&) (which is used to take the
value of the lookahead): other is not only stolen from its value, it
is also destroyed.  This incurs a new performance penalty of a few
percent, and union becomes faster again.

	* data/lalr1-fusion.cc (variant::build(other&)): Destroy other.
	(b4_variant_if): New.
	(variant::built): New.
	Use it wherever the status of the variant changes.
	* etc/bench.pl.in: Check the penalty of %define assert.
2008-11-07 21:38:06 +01:00
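The bookkeeping described in this message can be illustrated with a toy storage class. This is a sketch under assumptions, not the variant from the skeleton: the class name `variant_sketch` is made up, the member names only loosely follow the message (`build`, `as`, `destroy`, `built`), the check is always compiled in rather than being controlled by "%define assert", and the caller must make sure the buffer is large enough for the stored type.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Toy raw storage that remembers whether it currently holds a value.
template <std::size_t Size>
struct variant_sketch
{
  bool built = false;
  alignas (std::max_align_t) unsigned char buffer[Size];

  // Construct a T in place; overwriting a live value is a bug.
  template <typename T> T& build ()
  {
    assert (!built);
    built = true;
    return *new (buffer) T ();
  }

  // Access the stored value, which must exist.
  template <typename T> T& as ()
  {
    assert (built);
    return *reinterpret_cast<T*> (buffer);
  }

  // Take the value of other, then destroy other so that its
  // storage is reclaimed and not merely emptied.
  template <typename T> void build (variant_sketch& other)
  {
    build<T> ();
    std::swap (as<T> (), other.template as<T> ());
    other.template destroy<T> ();
  }

  // Run the destructor of the stored value.
  template <typename T> void destroy ()
  {
    assert (built);
    as<T> ().~T ();
    built = false;
  }
};
```

After `b.build<std::string> (a)`, the value lives in `b` and `a` is marked as not built, so a later `build` on `a` passes the assertion instead of silently overwriting a live string.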
Akim Demaille
639867b52f Use b4_copyright_years.
* data/yacc.c (b4_copyright_years): New.
	Fix its value according to the comments in the file.
	Use it and undefine it.
2008-11-04 21:43:55 +01:00
Akim Demaille
3c26260608 Formatting changes.
* data/lalr1-fusion.cc, src/parse-gram.y: here.
2008-11-04 21:43:51 +01:00
Akim Demaille
a2b93d5278 Formatting changes.
* data/lalr1-fusion.cc: here.
2008-11-04 21:43:46 +01:00
Akim Demaille
a0d4650a09 Remove spurious initial empty lines.
* data/glr.c, data/glr.cc, data/lalr1.cc, data/lalr1.java,
	* data/yacc.c: End the @output lines with an @.
2008-11-04 21:43:36 +01:00
Akim Demaille
4af4348a3f Don't memcpy C++ structures.
* data/lalr1-fusion.cc (b4_symbol_variant): Adjust additional
	arguments.
	(variant::build): New overload for
	copy-construction-that-destroys.
	(variant::swap): New.
	(parser::yypush_): Use it in variant mode.
2008-11-04 21:43:28 +01:00
Akim Demaille
006a030300 Sort methods.
* data/lalr1-fusion.cc (destroy): Use as() in its definition.
	Define it after as().
2008-11-04 21:21:56 +01:00
Akim Demaille
faef34664a Useless parens.
* data/lalr1-fusion.cc (b4_rhs_location): Remove useless parens.
2008-11-04 21:21:51 +01:00
Akim Demaille
2141adedf4 Issue missing synclines after user actions.
* data/c.m4 (b4_case): Issue synclines on the output file.
2008-11-04 21:21:47 +01:00
Akim Demaille
e426d17bc5 Remove trailing empty line.
* data/lalr1-fusion.cc: Don't add an empty line after the user's
	epilogue.
2008-11-04 21:21:43 +01:00
Akim Demaille
a9ce3f5413 Fix output of copyright years.
* data/bison.m4 (b4_copyright): Fix the indentation of the
	copyright year paragraph.
	Use b4_copyright_years when no years are given.
	* data/lalr1.cc, data/lalr1-fusion.cc, data/location.cc
	(b4_copyright_years): New.
	Use it.
2008-11-04 21:21:38 +01:00
Akim Demaille
59c544c268 Avoid the spurious initial empty line.
* data/lalr1-fusion.cc, data/location.cc: Put a trailing "@" at
	the end of @output request to suppress the empty line that
	results.
2008-11-04 21:21:34 +01:00
Akim Demaille
96b15448b9 Remove parser::rhs_number_type.
* data/lalr1-fusion.cc (rhs_number_type): No longer define it.
	(yyrhs_): Use b4_table_define.
2008-11-04 21:21:30 +01:00
Akim Demaille
f063317430 Fix iteration type.
* data/lalr1-fusion.cc: Use an int to iterate up to an int.
2008-11-04 21:21:25 +01:00
Akim Demaille
417b80040b Factor the declaration of the integer tables.
* data/lalr1-fusion.cc (b4_table_define): New.
	Use it.
2008-11-04 21:21:20 +01:00
Akim Demaille
1a5246c66f Fix indentation of tables in lalr1.cc
* data/lalr1-fusion.cc: Fix the indentation.
2008-11-03 22:01:27 +01:00
Akim Demaille
f8a95c9c12 Destroy the lhs symbols after reduction.
* data/lalr1-fusion.cc (parse): After the user action, when in
	variant mode, destroy the lhs symbols.
2008-11-03 22:01:22 +01:00
Akim Demaille
d7f4d82382 Simplify yysyntax_error_ use.
* data/lalr1-fusion.cc (yysyntax_error_): Always pass it the token
	type, but make it unnamed in the declaration when it is not used.
2008-11-03 22:01:18 +01:00
Akim Demaille
8b9c89fb65 Let yy::variant::build return an lvalue.
* data/lalr1-fusion.cc (variant::build): Return a reference to the
	object.
2008-11-03 22:01:14 +01:00
Akim Demaille
83243c24ba Define yy::variant only when needed.
* data/lalr1-fusion.cc (yy::variant): Define only if variants are
	used.
2008-11-03 22:01:10 +01:00
Akim Demaille
0e0ed236ab Fuse the three stacks into a single one.
In order to make it easy to perform benchmarks to ensure that there is no
performance loss, lalr1.cc is forked into lalr1-fusion.cc.  Eventually,
lalr1-fusion.cc will replace lalr1.cc.

Meanwhile, to make sure that lalr1-fusion.cc is correctly exercised by the
test suite, the user must install a symbolic link from lalr1.cc to it.

Instead of having three stacks (state, value, location), use a stack
of triples.  This considerably simplifies the code (and it will be
easier not to require locations, as the C++ parser currently does),
and also gives a 10% speedup according to etc/bench (probably mainly since
memory allocation is done once instead of three times).

Another motivation is to make it easier to destruct semantic values
properly: now that they are bound to their state (hence
symbol type) it will be easier to call the appropriate destructor.

These changes should probably benefit the C parser too.

	* data/lalr1.cc: Copy as...  * data/lalr1-fusion.cc: this new
	file.
	(b4_rhs_value, b4_rhs_location): New definitions overriding those
	from c++.m4.
	(state_stack_type, semantic_stack_type, location_stack_type)
	(yystate_stack_, yysemantic_stack_, yylocation_stack_): Remove.
	(data_type, stack_type, yystack_): New.
	(YYLLOC_DEFAULT, yypush_): Adjust.
	(yyerror_range): Now based on data_type, not location_type.
2008-11-03 21:59:59 +01:00
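A stack of triples, as described above, might look like the following. This is a simplified sketch: `location_sketch`, `data_type_sketch` and `stack_sketch` are hypothetical names, a `std::string` stands in for the semantic value, and the real skeleton uses its own stack class.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for a source location.
struct location_sketch
{
  int line;
  int column;
};

// One element bundles what used to live on three parallel stacks.
struct data_type_sketch
{
  int state;
  std::string value;          // stand-in for the semantic value
  location_sketch location;
};

class stack_sketch
{
public:
  // A single push, hence a single container growth, per symbol.
  void push (const data_type_sketch& d) { elems_.push_back (d); }

  // Pop n elements; state, value and location leave together.
  void pop (std::size_t n = 1) { elems_.resize (elems_.size () - n); }

  // Index 0 is the top of the stack.
  const data_type_sketch& operator[] (std::size_t i) const
  {
    return elems_[elems_.size () - 1 - i];
  }

  std::size_t size () const { return elems_.size (); }

private:
  std::vector<data_type_sketch> elems_;
};
```

Because each value sits next to its state, code that needs to destroy a value can look up the symbol type from the same element.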
Akim Demaille
7dedf26e55 Push the state, value, and location at the same time.
This is needed to prepare a forthcoming patch that fuses the three
stacks into one.

	* data/lalr1.cc (parser::yypush_): New.
	(parser::yynewstate): Change the semantics: instead of arriving at
	this label when value and location have been pushed, but yystate
	is to be pushed on the state stack, now the three of them must
	have been pushed before.  yystate still must be the new state.
	This allows using yypush_ everywhere instead of individual
	handling of the stacks.
2008-11-03 21:51:02 +01:00
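The change above keeps three separate stacks but funnels every push through one helper. A minimal sketch, assuming simplified stand-in types (an `int` for the location, a `std::string` for the value) and a hypothetical wrapper struct `three_stacks_sketch`:

```cpp
#include <string>
#include <vector>

struct three_stacks_sketch
{
  std::vector<int> states;
  std::vector<std::string> values;  // stand-in for semantic values
  std::vector<int> locations;       // stand-in for locations (a line number)

  // Push on all three stacks in one call, so that they always
  // have the same height when control reaches the next step.
  void yypush_ (int state, const std::string& value, int location)
  {
    states.push_back (state);
    values.push_back (value);
    locations.push_back (location);
  }
};
```

With every push going through one function, replacing the three containers by a single one later only requires changing that function and the accessors.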
Akim Demaille
c4585f1e2d Prefer references to pointers.
* data/lalr1.cc (b4_symbol_actions): New, overrides the default C
	definition to use references instead of pointers.
	(yy_symbol_value_print_, yy_symbol_print_, yydestruct_):
	Take the value and location as references.
	Adjust callers.
2008-11-03 21:50:57 +01:00
Akim Demaille
56017c172b stack::size instead of stack::height.
* data/lalr1.cc (stack::height): Rename as...
	(stack::size): this.
	Fix the output type.
	Comment changes.
2008-11-03 21:50:53 +01:00