55 Commits

Author SHA1 Message Date
Paul Eggert
a737b2163c Use more-consistent naming conventions for local vars. 2003-02-03 15:35:57 +00:00
Paul Eggert
1deb9bdcad src/scan-gram.l (<SC_BRACED_CODE>"}"): Append ";" only in braced code,
not in unions etc.
2002-12-31 02:26:51 +00:00
Paul Eggert
83adb046bf (<INITIAL,SC_AFTER_IDENTIFIER,SC_PRE_CODE>","):
Moved here from...
(<INITIAL>","): Here.  This causes stray "," to be treated
more uniformly.
2002-12-30 23:38:20 +00:00
Paul Eggert
255227393f (<SC_BRACED_CODE>"}"): Append ";" before the last brace in braced code
when not in Yacc mode, for compatibility with Bison 1.35.  This
resurrects the 2001-12-15 patch to src/reader.c.
2002-12-30 22:40:52 +00:00
Paul Eggert
624a35e20b (handle_dollar, handle_at): Now takes int
token_type, not braced_code code_kind.  All uses changed.
(SC_PRE_CODE): New state, for scanning after a keyword that
has (or usually has) an immediately-following braced code.
(token_type): New local var, to keep track of which token type
to return when scanning braced code.
(<INITIAL>"%destructor", <INITIAL>"%lex-param",
<INITIAL>"%parse-param", <INITIAL>"%printer",
<INITIAL>"%union"): Set token type and BEGIN SC_PRE_CODE
instead of returning a token type immediately.
(<INITIAL>"{"): Set token type.
(<SC_BRACED_CODE>"}"): Use it.
(handle_action_dollar, handle_action_at): Now returns bool
indicating success.  Fail if ! current_rule; this prevents a core dump.
(handle_symbol_code_dollar, handle_symbol_code_at):
Remove; merge body into caller.
(handle_dollar, handle_at): Complain in invalid contexts.
2002-12-24 07:46:49 +00:00
Paul Eggert
3b1e470c6d (<SC_ESCAPED_CHARACTER>"'"): Use unsigned char
local var instead of casting to unsigned char, to avoid casts.
2002-12-13 08:35:16 +00:00
Paul Eggert
223ff46e4c (<INITIAL>{int}): Use set_errno and get_errno instead of errno.
(<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>\\x[0-9abcdefABCDEF]+): Likewise.
(handle_action_dollar, handle_action_at): Likewise.
(obstack_for_string): Renamed from string_obstack.
2002-12-11 06:48:18 +00:00
Paul Eggert
3f2d73f157 Include "files.h".
(YY_USER_INIT): Initialize scanner_cursor instead
of *loc.
(STEP): Remove.  No longer needed, now that adjust_location does
the work.  All uses removed.
(scanner_cursor): New var.
(adjust_location): Renamed from extend_location.  It now sets
*loc and adjusts the scanner cursor.  All uses changed.
Don't bother testing for CR.
(handle_syncline): Remove location arg; now updates scanner cursor.
All callers changed.
(unexpected_end_of_file): Now accepts start boundary of token or
comment, not location.  All callers changed.  Update scanner cursor,
not the location.
(SC_AFTER_IDENTIFIER): New state.
(context_state): Renamed from c_context.  All uses changed.
(id_loc, code_start, token_start): New local vars.
(<INITIAL,SC_AFTER_IDENTIFIER>): New initial context.  Move all
processing of Yacc white space and equivalents here.
(<INITIAL>{id}): Save id_loc.  Begin state SC_AFTER_IDENTIFIER
instead of returning ID immediately, since we need to search for
a subsequent colon.
(<INITIAL>"'", "\""): Save token_start.
(<INITIAL>"%{", "{", "%%"): Save code_start.
(<SC_AFTER_IDENTIFIER>): New state, looking for a colon.
(<SC_YACC_COMMENT>, <SC_COMMENT>, <SC_LINE_COMMENT>):
BEGIN context_state at end, not INITIAL.
(<SC_ESCAPED_STRING>"\"", <SC_ESCAPED_CHARACTER>"'",
<SC_BRACED_CODE>"}", <SC_PROLOGUE>"%}", <SC_EPILOGUE><<EOF>>):
Return correct token start.
(<SC_BRACED_CODE,SC_PROLOGUE,SC_EPILOGUE>): Save start boundary when
the start of a character, string or multiline comment is found.
2002-12-07 06:14:27 +00:00
Paul Eggert
6c30d6413e (no_cr_read, extend_location): Move to epilogue,
and put only a forward declaration in the prologue.  This is for
consistency with the other scanner helper functions.
2002-12-01 02:37:56 +00:00
Paul Eggert
6b0d38ab2c [a-f] -> [abcdef], so that we don't assume the C locale. 2002-11-29 09:03:16 +00:00
Paul Eggert
763ed7a687 "," now elicits a warning, rather than being
a token; this is more compatible with byacc.
2002-11-29 08:44:40 +00:00
Paul Eggert
41141c568e (STEP): Renamed from YY_STEP. All uses changed.
(STRING_GROW): Renamed from YY_OBS_GROW.  All uses changed.
(STRING_FINISH): Renamed from YY_OBS_FINISH.  All uses changed.
(STRING_FREE): Renamed from YY_OBS_FREE.  All uses changed.
2002-11-27 18:34:14 +00:00
Paul Eggert
412f8a5975 Revamp regular expressions so that " and '
do not confuse xgettext.
2002-11-13 06:40:35 +00:00
Akim Demaille
7ec2d4cd39 * src/scan-gram.l, src/reader.h (scanner_last_string_free):
Restore.
* src/scan-gram.l (last_string): Is global to the file, not to
yylex.
* src/parse-gram.y (input): Don't append the epilogue here,
(epilogue.opt): do it here, and free the scanner's obstack.
* src/reader.c (epilogue_set): Rename as...
(epilogue_augment): this.
* data/c.m4 (b4_epilogue): Defaults to empty.
2002-11-12 08:26:38 +00:00
Akim Demaille
95612cfa60 * src/struniq.h, src/struniq.c (struniq_t): Is const.
(STRUNIQ_EQ, struniq_assert, struniq_assert_p): New.
Use struniq for symbols.
* src/symtab.h (symbol_t): The tag member is a struniq.
(symbol_type_set): Adjust.
* src/symtab.c (symbol_new): Takes a struniq.
(symbol_free): Don't free the tag member.
(hash_compare_symbol_t, hash_symbol_t): Rename as...
(hash_compare_symbol, hash_symbol): these.
Use the fact that tags are struniqs.
(symbol_get): Use struniq_new.
* src/symlist.h, src/symlist.c (symbol_list_n_type_name_get):
Returns a struniq.
* src/reader.h (merger_list, grammar_currentmerge_set): The name
and type members are struniqs.
* src/reader.c (get_merge_function)
(grammar_current_rule_merge_set): Adjust.
(TYPE, current_type): Are struniq.
Use struniq for file names.
* src/files.h, src/files.c (infile): Split into...
(grammar_file, current_file): these.
* src/scan-gram.c (YY_USER_INIT, handle_syncline): Adjust.
* src/reduce.c (reduce_print): Likewise.
* src/getargs.c (getargs): Likewise.
* src/complain.h, src/complain.c: Likewise.
* src/main.c (main): Call struniqs_new early enough to use it for
file names.
Don't free the input file name.
2002-11-12 08:05:59 +00:00
Akim Demaille
3e6656f9ab * src/symtab.c (symbol_free): Remove dead deactivated code:
type_name are properly removed.
Don't use XFREE to free items that cannot be NULL.
* src/struniq.h, src/struniq.c: New.
* src/main.c (main): Initialize/free struniqs.
* src/parse-gram.y (%union): Add a struniq member.
(yyprint): Adjust.
* src/scan-gram.l (<{tag}>): Return a struniq.
Free the obstack bit that used to store it.
* src/symtab.h (symbol_t): The 'type_name' member is a struniq.
2002-11-12 07:55:55 +00:00
Paul Eggert
ac060e78a3 (<SC_CHARACTER>): Don't worry about any backslash
escapes other than \\ and \'; this simplifies the code.
(<SC_STRING>): Likewise, for \\ and \".
(<SC_COMMENT,SC_LINE_COMMENT,SC_STRING,SC_CHARACTER,SC_BRACED_CODE,
SC_PROLOGUE,SC_EPILOGUE>): Escape $ and @, too.
Use new escapes @{ and @} for [ and ].
2002-11-12 07:27:04 +00:00
Paul Eggert
345532d70b (unexpected_end_of_file): Fix bug: columns were counted in the token
inserted at end of file.  Now takes location_t *, not location_t, so
that the location can be adjusted.  All uses changed.
2002-11-10 05:17:56 +00:00
Paul Eggert
a706a1cc03 Remove stack option. We no longer use the stack, since the stack was
never deeper than 1; instead, use the new auto var c_context to record
the stacked value.

Remove nounput option.  At an unexpected end of file, we now unput
the minimal input necessary to end cleanly; this simplifies the
code.

Avoid unbounded token sizes where this is easy.

(unexpected_end_of_file): New function.
Use it to systematize the error message on unexpected EOF.
(last_string): Now auto, not static.
(YY_OBS_FREE): Remove unnecessary do while (0) wrapper.
(scanner_last_string_free): Remove; not used.
(percent_percent_count): Move decl to just before use.
(SC_ESCAPED_CHARACTER): Return ID at unexpected end of file,
not the (never otherwise-used) CHARACTER.
2002-11-08 05:20:20 +00:00
Paul Eggert
8e6ef48342 (unexpected_end_of_file): New function.
Use it to systematize the error message on unexpected EOF.
2002-11-07 08:15:11 +00:00
Akim Demaille
900c5db537 * src/main.c (main): Free `infile'.
* src/scan-gram.l (handle_syncline): New.
Recognize `#line'.
* src/output.c (user_actions_output, symbol_destructors_output)
(symbol_printers_output): Use the location's file name, not
infile.
* src/reader.c (prologue_augment, epilogue_set): Likewise.
2002-11-06 08:08:46 +00:00
Paul Eggert
98f2caaa5f Use more accurate diagnostics, e.g.
"integer out of range" rather than "invalid value".
2002-11-06 07:01:06 +00:00
Paul Eggert
1a9e39f116 (braces_level): Now auto, not static.
Initialize to zero if the compiler is being picky.
(INITIAL): Clear braces_level instead of incrementing it.
(SC_BRACED_CODE): Treat <% and %> as { and } when inside C code,
as POSIX 1003.1-2001 requires.
2002-11-05 23:50:11 +00:00
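The digraph handling described in this entry can be illustrated with a minimal sketch. The function name brace_depth is hypothetical and is not taken from scan-gram.l; the sketch only counts nesting and, unlike a real scanner, does not skip strings, character constants, or comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from scan-gram.l): return the net brace
   depth of CODE, counting the digraphs <% and %> as { and }.
   Strings and comments are NOT skipped; a real scanner must do that.  */
int
brace_depth (char const *code)
{
  int depth = 0;
  size_t i = 0;

  assert (code != NULL);
  while (code[i] != '\0')
    {
      if (code[i] == '{')
        {
          depth++;
          i++;
        }
      else if (code[i] == '}')
        {
          depth--;
          i++;
        }
      else if (code[i] == '<' && code[i + 1] == '%')
        {
          /* Digraph for '{'; consume both characters.  */
          depth++;
          i += 2;
        }
      else if (code[i] == '%' && code[i + 1] == '>')
        {
          /* Digraph for '}'; consume both characters.  */
          depth--;
          i += 2;
        }
      else
        i++;
    }
  return depth;
}
```

Reading code[i + 1] is safe here because code[i] is known not to be the terminating NUL, so the next element is at worst the terminator itself.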
Akim Demaille
29c017256a * src/scan-gram.l: When it starts with `%', complain about the
whole directive, not just that `invalid character: %'.
2002-11-05 21:20:14 +00:00
Akim Demaille
c4d720cdbb * src/location.h (LOCATION_PRINT): Use quotearg slot 3 to avoid
clashes.
* src/scan-gram.l: Use [\'] instead of ['] to pacify
font-lock-mode.
Use complain_at.
Use quote, not quote_n since LOCATION_PRINT no longer uses the
slot 0.
2002-11-04 08:28:01 +00:00
Paul Eggert
d8d3f94a99 Revamp to fix POSIX incompatibilities, to count columns correctly, and
to check for invalid inputs.

Use mbsnwidth to count columns correctly.  Account for tabs, too.
Include mbswidth.h.
(YY_USER_ACTION): Invoke extend_location rather than LOCATION_COLUMNS.
(extend_location): New function.
(YY_LINES): Remove.

Handle CRLF in C code rather than in Lex code.
(YY_INPUT): New macro.
(no_cr_read): New function.

Scan UCNs, even though we don't fully handle them yet.
(convert_ucn_to_byte): New function.

Handle backslash-newline correctly in C code.
(SC_LINE_COMMENT, SC_YACC_COMMENT): New states.
(eols, blanks): Remove.  YY_USER_ACTION now counts newlines etc.;
all uses changed.
(tag, splice): New EREs.  Do not allow NUL or newline in tags.
Use {splice} wherever C allows backslash-newline.
YY_STEP after space, newline, vertical-tab.
("/*"): BEGIN SC_YACC_COMMENT, not yy_push_state (SC_COMMENT).

(letter, id): Don't assume ASCII; e.g., spell out a-z.

({int}, handle_action_dollar, handle_action_at): Check for integer
overflow.

(YY_STEP): Omit trailing semicolon, so that it's more like C.

(<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>): Allow \0 and \00
as well as \000.  Check for UCHAR_MAX, not 255.
Allow \x with an arbitrary positive number of digits, as in C.
Check for overflow here.
Allow \? and UCNs, for compatibility with C.

(handle_symbol_code_dollar): Use quote_n slot 1 to avoid collision
with quote slot used by complain_at.
2002-11-03 08:42:32 +00:00
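The range checks mentioned in this entry (UCHAR_MAX rather than 255, and overflow detection for \x escapes with arbitrarily many digits) might look roughly like the sketch below. The helper name hex_escape_to_byte is an assumption for illustration and does not come from the scanner; it also assumes the caller passes only hexadecimal digits, as a scanner pattern would guarantee.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not from scan-gram.l): convert the hex digits
   that follow "\x" into a byte.  Return false if there are no digits,
   if anything follows the digits, or if the value does not fit in an
   unsigned char.  DIGITS is assumed to hold only hex digits.  */
bool
hex_escape_to_byte (char const *digits, unsigned char *result)
{
  char *end;
  unsigned long value;

  assert (digits != NULL && result != NULL);

  errno = 0;
  value = strtoul (digits, &end, 16);
  if (end == digits || *end != '\0')
    return false;               /* No digits, or trailing junk.  */
  if (errno == ERANGE || UCHAR_MAX < value)
    return false;               /* Overflowed, or too big for a byte.  */

  *result = (unsigned char) value;
  return true;
}
```

Comparing against UCHAR_MAX instead of a literal 255 keeps the check correct on hosts where a byte is wider than 8 bits, and testing errno catches inputs too long for unsigned long.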
Paul Eggert
d33cb3ae09 Remove all uses of PARAMS, since we now assume C89 or better. 2002-10-21 05:30:50 +00:00
Akim Demaille
ae7453f2ba Prototype support of %lex-param and %parse-param.
* src/parse-gram.y: Add the definition of the %lex-param and
%parse-param tokens, plus their rules.
Drop the `_' version of %glr-parser.
Add the "," token.
* src/scan-gram.l (INITIAL): Scan them.
* src/muscle_tab.c: Comment changes.
(muscle_insert, muscle_find): Rename `pair' as `probe'.
* src/muscle_tab.h (MUSCLE_INSERT_PREFIX): Remove unused.
(muscle_entry_s): The `value' member is no longer const.
Adjust all dependencies.
* src/muscle_tab.c (muscle_init): Adjust: use
MUSCLE_INSERT_STRING.
Initialize the obstack earlier.
* src/muscle_tab.h, src/muscle_tab.c (muscle_grow)
(muscle_pair_list_grow): New.
* data/c.m4 (b4_c_function_call, b4_c_args): New.
* data/yacc.c (YYLEX): Use b4_c_function_call to honor %lex-param.
* tests/calc.at: Use %locations, not --locations.
(AT_CHECK_CALC_GLR): Use %glr-parser, not %glr_parser.
2002-10-19 14:38:06 +00:00
Akim Demaille
473d0a7567 * src/getargs.h (trace_e): Add trace_scan, and trace_parse.
* src/getargs.c (trace_types, trace_args): Adjust.
* src/reader.c (grammar_current_rule_prec_set)
(grammar_current_rule_dprec_set, grammar_current_rule_merge_set):
Standardize error messages.
And s/@prec/%prec/!
(reader): Use trace_flag to enable scanner/parser debugging,
instead of an adhoc scheme.
* src/scan-gram.l: Remove trailing debugging code.
2002-10-17 17:47:33 +00:00
Paul Eggert
efcb44dd47 (rule_length): New static var.
Use it to keep track of the rule length in the scanner, since
we can't expect the parser to be in lock-step sync with the scanner.
(handle_action_dollar, handle_action_at): Use this var.
2002-10-13 08:38:39 +00:00
Akim Demaille
eb71459201 * tests/regression.at (Characters Escapes): New.
* src/scan-gram.l (SC_ESCAPED_CHARACTER): Accept ' in strings and
characters.
Reported by Jan Nieuwenhuizen.
2002-10-11 11:23:19 +00:00
Paul Eggert
db2cc12fd0 Wrap strings in _() if they need translation.
Use strings rather than escapes when possible,
to minimize the number of warnings from xgettext.

(handle_action_dollar, handle_action_at): Don't use isdigit,
as it mishandles negative chars and it may not work as expected
outside the C locale.
2002-08-12 14:52:47 +00:00
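The isdigit concern in this entry can be shown with a small sketch. Passing a negative char value to isdigit is undefined behavior, and the scanner only cares about the ten ASCII-style digits, so a plain range comparison is enough. The names is_ascii_digit and leading_digits are hypothetical and are not the scanner's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Locale-independent digit test: true only for '0' through '9'.
   Unlike isdigit, this is well defined for negative char values.
   C guarantees that the digit characters are contiguous.  */
bool
is_ascii_digit (char c)
{
  return '0' <= c && c <= '9';
}

/* Hypothetical helper: count the leading decimal digits of S,
   e.g. the "12" in the "12" that follows a '$' in an action.  */
int
leading_digits (char const *s)
{
  int n = 0;

  assert (s != NULL);
  while (is_ascii_digit (s[n]))
    n++;
  return n;
}
```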
Akim Demaille
5dde258a9e * src/scan-gram.l (id): Can start with an underscore. 2002-07-19 08:31:32 +00:00
Akim Demaille
1a715ef2fc * lib/quotearg.h: Protect against multiple inclusions.
* src/location.h (location_t): Add a `file' member.
(LOCATION_RESET, LOCATION_PRINT): Adjust.
* src/complain.c (warn_at, complain_at, fatal_at): Drop
`error_one_per_line' support.
2002-07-09 16:24:57 +00:00
Akim Demaille
a5d5099417 * src/complain.h, src/complain.c (warn, complain): Remove, unused.
* src/reader.c (lineno): Remove.
Adjust all dependencies.
(get_merge_function): Take a location and use complain_at.
* src/symtab.h, src/symtab.c (symbol_make_alias): Likewise.
* tests/regression.at (Invalid inputs, Mixing %token styles):
Adjust.
2002-07-09 15:54:39 +00:00
Akim Demaille
536545f3a4 * src/output.c (prepare_actions): Free `tally' and `width'.
(prepare_actions): Allocate and free `order'.
* src/symtab.c (symbols_free): Free `symbols'.
* src/scan-gram.l (scanner_free): Clear Flex's scanner's memory.
* src/output.c (m4_invoke): Move to...
* src/scan-skel.l: here.
(<<EOF>>): Close yyout, and free its name.
2002-07-03 06:52:02 +00:00
Paul Eggert
e68d4575b3 (<SC_ESCAPED_CHARACTER>): Convert to unsigned char, so that negative
chars don't collide with $.
2002-07-01 08:36:37 +00:00
Akim Demaille
97650f4efc We spend a lot of time in quotearg, in particular when --verbose.
* src/symtab.c (symbol_get): Store a quoted version of the key.
(symbol_tag_get, symbol_tag_get_n, symbol_tag_print): Remove.
Adjust all callers.
2002-06-30 17:34:52 +00:00
Akim Demaille
39f4191608 * src/reader.c (gensym): Rename as...
* src/symtab.h, src/symtab.c (dummy_symbol_get): this.
(getsym): Rename as...
(symbol_get): this.
2002-06-30 17:27:57 +00:00
Paul Hilfinger
676385e29c Initial check-in introducing experimental GLR parsing. See entry in
ChangeLog dated 2002-06-27 from Paul Hilfinger for details.
2002-06-28 02:26:44 +00:00
Akim Demaille
e776192e4f * src/parse-gram.y (YYPRINT, yyprint): Don't mess with the parser
internals.
* src/reader.h, src/reader.c (grammar_current_rule_prec_set):
Takes a location.
* src/symtab.h, src/symtab.c (symbol_class_set)
(symbol_user_token_number_set): Likewise.
Adjust all callers.
Promote complain_at.
* tests/input.at (Type Clashes): Adjust.
2002-06-20 11:10:56 +00:00
Akim Demaille
366eea36d3 * src/symtab.h, src/symtab.c (symbol_t): printer and
printer_location are new members.
(symbol_printer_set): New.
* src/parse-gram.y (PERCENT_PRINTER): New token.
Handle its associated rule.
* src/scan-gram.l: Adjust.
(handle_destructor_at, handle_destructor_dollar): Rename as...
(handle_symbol_code_at, handle_symbol_code_dollar): these.
* src/output.c (symbol_printers_output): New.
(output_skeleton): Call it.
* data/bison.simple (yysymprint): New.  Cannot be named yyprint
since there are already many grammar files with a user `yyprint'.
Replace the calls to YYPRINT to calls to yysymprint.
* tests/calc.at: Adjust.
* tests/torture.at (AT_DATA_STACK_TORTURE): Remove YYPRINT: it was
taking advantage of parser very internal details (stack size!).
2002-06-20 09:08:37 +00:00
Akim Demaille
4f25ebb043 * src/scan-gram.l: Complete the scanner with the missing patterns
to pacify Flex.
Use `quote' and `symbol_tag_get' where appropriate.
2002-06-20 07:19:13 +00:00
Akim Demaille
f25bfb75aa Prepare @$ in %destructor, but currently don't bind it in the
skeleton, as %location use is not cleaned up yet.
* src/scan-gram.l (handle_dollar, handle_destructor_at)
(handle_action_at): New.
(handle_at, handle_action_dollar, handle_destructor_dollar): Take
a braced_code_t and a location as additional arguments.
(handle_destructor_dollar): Instead of requiring `b4_eval', just
unquote one when outputting `b4_dollar_dollar'.
Adjust callers.
* data/bison.simple (b4_eval): Remove.
(b4_symbol_destructor): Adjust.
* tests/input.at (Invalid @n): Adjust.
2002-06-19 08:22:49 +00:00
Akim Demaille
9280d3ef89 * data/m4sugar/m4sugar.m4 (m4_map): Recognize when the list of
arguments is really empty, not only equal to `[]'.
* src/symtab.h, src/symtab.c (symbol_t): `destructor' is a new
member.
(symbol_destructor_set): New.
* src/output.c (symbol_destructors_output): New.
* src/reader.h (brace_code_t, current_braced_code): New.
* src/scan-gram.l (BRACED_CODE): Use it to branch on...
(handle_dollar): Rename as...
(handle_action_dollar): this.
(handle_destructor_dollar): New.
* src/parse-gram.y (PERCENT_DESTRUCTOR): New.
(grammar_declaration): Use it.
* data/bison.simple (yystos): Is always defined.
(yydestructor): New.
* tests/actions.at (Destructors): New.
* tests/calc.at (_AT_CHECK_CALC_ERROR): Don't rely on egrep.
2002-06-17 08:43:12 +00:00
Akim Demaille
dafdc66ff0 * src/symlist.h, src/symlist.c (symbol_list_length): New.
* src/scan-gram.l (handle_dollar, handle_at): Compute the
rule_length only when needed.
* src/output.c (actions_output, token_definitions_output): Output
the full M4 block.
* src/symtab.c: Don't access directly to the symbol tag, use
symbol_tag_get.
* src/parse-gram.y: Use symbol_list_free.
2002-06-17 07:05:12 +00:00
Akim Demaille
56c4720342 * src/reader.h, src/reader.c (symbol_list, symbol_list_new)
(symbol_list_prepend, get_type_name): Move to...
* src/symlist.h, src/symlist.c (symbol_list_t, symbol_list_new)
(symbol_list_prepend, symbol_list_n_type_name_get): here.
Adjust all callers.
(symbol_list_free): New.
* src/scan-gram.l (handle_dollar): Takes a location.
* tests/input.at (Invalid $n): Adjust.
2002-06-17 07:04:49 +00:00
Akim Demaille
ee000ba4fc Let symbols have a location.
* src/symtab.h, src/symtab.c (symbol_t): Location is a new member.
(getsym): Adjust.
Adjust all callers.
* src/complain.h, src/complain.c (complain_at, fatal_at, warn_at):
Use location_t, not int.
* src/symtab.c (symbol_check_defined): Take advantage of the
location.
* tests/regression.at (Invalid inputs): Adjust.
2002-06-15 18:21:46 +00:00
Akim Demaille
8efe435c05 * src/parse-gram.y (YYLLOC_DEFAULT, current_lhs_location): New.
(input): Don't try to initialize yylloc here, do it in the
scanner.
* src/scan-gram.l (YY_USER_INIT): Initialize yylloc.
* src/gram.h (rule_t): Change line and action_line into location
and action_location, of location_t type.
Adjust all dependencies.
* src/location.h, src/location.c (empty_location): New.
* src/reader.h, src/reader.c (grammar_start_symbol_set)
(grammar_symbol_append, grammar_rule_begin, grammar_rule_end)
(grammar_current_rule_symbol_append)
(grammar_current_rule_action_append): Expect a location as argument.
* src/reader.c (grammar_midrule_action): Adjust to attach an
action's location as dummy symbol location.
* src/symtab.h, src/symtab.c (startsymbol_location): New.
* tests/regression.at (Web2c Report, Rule Line Numbers): Adjust
the line numbers.
2002-06-15 18:21:11 +00:00
Akim Demaille
75d1fe1611 * src/scan-gram.l (SC_BRACED_CODE): Don't use `<.*>', it is too
eager.
* tests/actions.at (Exotic Dollars): New.
2002-06-12 15:14:59 +00:00