Commit Graph

34 Commits

Author SHA1 Message Date
Paul Eggert
98f2caaa5f Use more accurate diagnostics, e.g.
"integer out of range" rather than "invalid value".
2002-11-06 07:01:06 +00:00
Paul Eggert
1a9e39f116 (braces_level): Now auto, not static.
Initialize to zero if the compiler is being picky.
(INITIAL): Clear braces_level instead of incrementing it.
(SC_BRACED_CODE): Treat <% and %> as { and } when inside C code,
as POSIX 1003.1-2001 requires.
2002-11-05 23:50:11 +00:00
Akim Demaille
29c017256a * src/scan-gram.l: When it starts with `%', complain about the
whole directive, not just that `invalid character: %'.
2002-11-05 21:20:14 +00:00
Akim Demaille
c4d720cdbb * src/location.h (LOCATION_PRINT): Use quotearg slot 3 to avoid
clashes.
* src/scan-gram.l: Use ['] instead of ['] to pacify
font-lock-mode.
Use complain_at.
Use quote, not quote_n since LOCATION_PRINT no longer uses the
slot 0.
2002-11-04 08:28:01 +00:00
Paul Eggert
d8d3f94a99 Revamp to fix POSIX incompatibilities, to count columns correctly, and
to check for invalid inputs.

Use mbsnwidth to count columns correctly.  Account for tabs, too.
Include mbswidth.h.
(YY_USER_ACTION): Invoke extend_location rather than LOCATION_COLUMNS.
(extend_location): New function.
(YY_LINES): Remove.

Handle CRLF in C code rather than in Lex code.
(YY_INPUT): New macro.
(no_cr_read): New function.

Scan UCNs, even though we don't fully handle them yet.
(convert_ucn_to_byte): New function.

Handle backslash-newline correctly in C code.
(SC_LINE_COMMENT, SC_YACC_COMMENT): New states.
(eols, blanks): Remove.  YY_USER_ACTION now counts newlines etc.;
all uses changed.
(tag, splice): New EREs.  Do not allow NUL or newline in tags.
Use {splice} wherever C allows backslash-newline.
YY_STEP after space, newline, vertical-tab.
("/*"): BEGIN SC_YACC_COMMENT, not yy_push_state (SC_COMMENT).

(letter, id): Don't assume ASCII; e.g., spell out a-z.

({int}, handle_action_dollar, handle_action_at): Check for integer
overflow.

(YY_STEP): Omit trailing semicolon, so that it's more like C.

(<SC_ESCAPED_STRING,SC_ESCAPED_CHARACTER>): Allow \0 and \00
as well as \000.  Check for UCHAR_MAX, not 255.
Allow \x with an arbitrary positive number of digits, as in C.
Check for overflow here.
Allow \? and UCNs, for compatibility with C.

(handle_symbol_code_dollar): Use quote_n slot 1 to avoid collision
with quote slot used by complain_at.
2002-11-03 08:42:32 +00:00
Paul Eggert
d33cb3ae09 Remove all uses of PARAMS, since we now assume C89 or better. 2002-10-21 05:30:50 +00:00
Akim Demaille
ae7453f2ba Prototype support of %lex-param and %parse-param.
* src/parse-gram.y: Add the definition of the %lex-param and
%parse-param tokens, plus their rules.
Drop the `_' version of %glr-parser.
Add the "," token.
* src/scan-gram.l (INITIAL): Scan them.
* src/muscle_tab.c: Comment changes.
(muscle_insert, muscle_find): Rename `pair' as `probe'.
* src/muscle_tab.h (MUSCLE_INSERT_PREFIX): Remove unused.
(muscle_entry_s): The `value' member is no longer const.
Adjust all dependencies.
* src/muscle_tab.c (muscle_init): Adjust: use
MUSCLE_INSERT_STRING.
Initialize the obstack earlier.
* src/muscle_tab.h, src/muscle_tab.c (muscle_grow)
(muscle_pair_list_grow): New.
* data/c.m4 (b4_c_function_call, b4_c_args): New.
* data/yacc.c (YYLEX): Use b4_c_function_call to honor %lex-param.
* tests/calc.at: Use %locations, not --locations.
(AT_CHECK_CALC_GLR): Use %glr-parser, not %glr_parser.
2002-10-19 14:38:06 +00:00
Akim Demaille
473d0a7567 * src/getargs.h (trace_e): Add trace_scan, and trace_parse.
* src/getargs.c (trace_types, trace_args): Adjust.
* src/reader.c (grammar_current_rule_prec_set)
(grammar_current_rule_dprec_set, grammar_current_rule_merge_set):
Standardize error messages.
And s/@prec/%prec/!
(reader): Use trace_flag to enable scanner/parser debugging,
instead of an adhoc scheme.
* src/scan-gram.l: Remove trailing debugging code.
2002-10-17 17:47:33 +00:00
Paul Eggert
efcb44dd47 (rule_length): New static var.
Use it to keep track of the rule length in the scanner, since
we can't expect the parser to be in lock-step sync with the scanner.
(handle_action_dollar, handle_action_at): Use this var.
2002-10-13 08:38:39 +00:00
Akim Demaille
eb71459201 * tests/regression.at Characters Escapes): New.
* src/scan-gram.l (SC_ESCAPED_CHARACTER): Accept ' in strings and
characters.
Reported by Jan Nieuwenhuizen.
2002-10-11 11:23:19 +00:00
Paul Eggert
db2cc12fd0 Wrap strings in _() if they need translation.
Use strings rather than escapes when possible,
to minimize the number of warnings from xgettext.

(handle_action_dollar, handle_action_at): Don't use isdigit,
as it mishandles negative chars and it may not work as expected
outside the C locale.
2002-08-12 14:52:47 +00:00
Akim Demaille
5dde258a9e * src/scan-gram.l (id): Can start with an underscore. 2002-07-19 08:31:32 +00:00
Akim Demaille
1a715ef2fc * lib/quotearg.h: Protect against multiple inclusions.
* src/location.h (location_t): Add a `file' member.
(LOCATION_RESET, LOCATION_PRINT): Adjust.
* src/complain.c (warn_at, complain_at, fatal_at): Drop
`error_one_per_line' support.
2002-07-09 16:24:57 +00:00
Akim Demaille
a5d5099417 * src/complain.h, src/complain.c (warn, complain): Remove, unused.
* src/reader.c (lineno): Remove.
Adjust all dependencies.
(get_merge_function): Take a location and use complain_at.
* src/symtab.h, src/symtab.c (symbol_make_alias): Likewise.
* tests/regression.at (Invalid inputs, Mixing %token styles):
Adjust.
2002-07-09 15:54:39 +00:00
Akim Demaille
536545f3a4 * src/output.c (prepare_actions): Free tally' and width'.
(prepare_actions): Allocate and free `order'.
* src/symtab.c (symbols_free): Free `symbols'.
* src/scan-gram.l (scanner_free): Clear Flex's scanners memory.
* src/output.c (m4_invoke): Move to...
* src/scan-skel.l: here.
(<<EOF>>): Close yyout, and free its name.
2002-07-03 06:52:02 +00:00
Paul Eggert
e68d4575b3 (<SC_ESCAPED_CHARACTER>): Convert to unsigned char, so that negative
chars don't collide with $.
2002-07-01 08:36:37 +00:00
Akim Demaille
97650f4efc We spend a lot of time in quotearg, in particular when --verbose.
* src/symtab.c (symbol_get): Store a quoted version of the key.
(symbol_tag_get, symbol_tag_get_n, symbol_tag_print): Remove.
Adjust all callers.
2002-06-30 17:34:52 +00:00
Akim Demaille
39f4191608 * src/reader.c (gensym): Rename as...
* src/symtab.h, src/symtab.c (dummy_symbol_get): this.
(getsym): Rename as...
(symbol_get): this.
2002-06-30 17:27:57 +00:00
Paul Hilfinger
676385e29c Initial check-in introducing experimental GLR parsing. See entry in
ChangeLog dated 2002-06-27 from Paul Hilfinger for details.
2002-06-28 02:26:44 +00:00
Akim Demaille
e776192e4f * src/parse-gram.y (YYPRINT, yyprint): Don't mess with the parser
internals.
* src/reader.h, src/reader.c (grammar_current_rule_prec_set):
Takes a location.
* src/symtab.h, src/symtab.c (symbol_class_set)
(symbol_user_token_number_set): Likewise.
Adjust all callers.
Promote complain_at.
* tests/input.at (Type Clashes): Adjust.
2002-06-20 11:10:56 +00:00
Akim Demaille
366eea36d3 * src/symtab.h, src/symtab.c (symbol_t): printer and
printer_location are new members.
(symbol_printer_set): New.
* src/parse-gram.y (PERCENT_PRINTER): New token.
Handle its associated rule.
* src/scan-gram.l: Adjust.
(handle_destructor_at, handle_destructor_dollar): Rename as...
(handle_symbol_code_at, handle_symbol_code_dollar): these.
* src/output.c (symbol_printers_output): New.
(output_skeleton): Call it.
* data/bison.simple (yysymprint): New.  Cannot be named yyprint
since there are already many grammar files with a user `yyprint'.
Replace the calls to YYPRINT to calls to yysymprint.
* tests/calc.at: Adjust.
* tests/torture.at (AT_DATA_STACK_TORTURE): Remove YYPRINT: it was
taking advantage of parser very internal details (stack size!).
2002-06-20 09:08:37 +00:00
Akim Demaille
4f25ebb043 * src/scan-gram.l: Complete the scanner with the missing patterns
to pacify Flex.
Use `quote' and `symbol_tag_get' where appropriate.
2002-06-20 07:19:13 +00:00
Akim Demaille
f25bfb75aa Prepare @$ in %destructor, but currently don't bind it in the
skeleton, as %location use is not cleaned up yet.
* src/scan-gram.l (handle_dollar, handle_destructor_at)
(handle_action_at): New.
(handle_at, handle_action_dollar, handle_destructor_dollar): Take
a braced_code_t and a location as additional arguments.
(handle_destructor_dollar): Instead of requiring `b4_eval', just
unquote one when outputting `b4_dollar_dollar'.
Adjust callers.
* data/bison.simple (b4_eval): Remove.
(b4_symbol_destructor): Adjust.
* tests/input.at (Invalid @n): Adjust.
2002-06-19 08:22:49 +00:00
Akim Demaille
9280d3ef89 * data/m4sugar/m4sugar.m4 (m4_map): Recognize when the list of
arguments is really empty, not only equal to `[]'.
* src/symtab.h, src/symtab.c (symbol_t): `destructor' is a new
member.
(symbol_destructor_set): New.
* src/output.c (symbol_destructors_output): New.
* src/reader.h (brace_code_t, current_braced_code): New.
* src/scan-gram.l (BRACED_CODE): Use it to branch on...
(handle_dollar): Rename as...
(handle_action_dollar): this.
(handle_destructor_dollar): New.
* src/parse-gram.y (PERCENT_DESTRUCTOR): New.
(grammar_declaration): Use it.
* data/bison.simple (yystos): Is always defined.
(yydestructor): New.
* tests/actions.at (Destructors): New.
* tests/calc.at (_AT_CHECK_CALC_ERROR): Don't rely on egrep.
2002-06-17 08:43:12 +00:00
Akim Demaille
dafdc66ff0 * src/symlist.h, src/symlist.c (symbol_list_length): New.
* src/scan-gram.l (handle_dollar, handle_at): Compute the
rule_length only when needed.
* src/output.c (actions_output, token_definitions_output): Output
the full M4 block.
* src/symtab.c: Don't access directly to the symbol tag, use
symbol_tag_get.
* src/parse-gram.y: Use symbol_list_free.
2002-06-17 07:05:12 +00:00
Akim Demaille
56c4720342 * src/reader.h, src/reader.c (symbol_list, symbol_list_new)
(symbol_list_prepend, get_type_name): Move to...
* src/symlist.h, src/symlist.c (symbol_list_t, symbol_list_new)
(symbol_list_prepend, symbol_list_n_type_name_get): here.
Adjust all callers.
(symbol_list_free): New.
* src/scan-gram.l (handle_dollar): Takes a location.
* tests/input.at (Invalid $n): Adjust.
2002-06-17 07:04:49 +00:00
Akim Demaille
ee000ba4fc Let symbols have a location.
* src/symtab.h, src/symtab.c (symbol_t): Location is a new member.
(getsym): Adjust.
Adjust all callers.
* src/complain.h, src/complain.c (complain_at, fatal_at, warn_at):
Use location_t, not int.
* src/symtab.c (symbol_check_defined): Take advantage of the
location.
* tests/regression.at (Invalid inputs): Adjust.
2002-06-15 18:21:46 +00:00
Akim Demaille
8efe435c05 * src/parse-gram.y (YYLLOC_DEFAULT, current_lhs_location): New.
(input): Don't try to initialize yylloc here, do it in the
scanner.
* src/scan-gram.l (YY_USER_INIT): Initialize yylloc.
* src/gram.h (rule_t): Change line and action_line into location
and action_location, of location_t type.
Adjust all dependencies.
* src/location.h, src/location.c (empty_location): New.
* src/reader.h, src/reader.c (grammar_start_symbol_set)
(grammar_symbol_append, grammar_rule_begin, grammar_rule_end)
(grammar_current_rule_symbol_append)
(grammar_current_rule_action_append): Expect a location as argument.
* src/reader.c (grammar_midrule_action): Adjust to attach an
action's location as dummy symbol location.
* src/symtab.h, src/symtab.c (startsymbol_location): New.
* tests/regression.at (Web2c Report, Rule Line Numbers): Adjust
the line numbers.
2002-06-15 18:21:11 +00:00
Akim Demaille
75d1fe1611 * src/scan-gram.l (SC_BRACED_CODE): Don't use `<.*>', it is too
eager.
* tests/actions.at (Exotic Dollars): New.
2002-06-12 15:14:59 +00:00
Akim Demaille
6c35d22c39 * src/scan-gram.l (SC_PROLOGUE): Don't eat characters amongst
['"/] too eagerly.
* tests/input.at (Torturing the Scanner): New.
2002-06-12 12:50:22 +00:00
Akim Demaille
1d6412adeb * src/scan-gram.l (YY_OBS_INIT): Remove, replace with...
[SC_COMMENT,SC_STRING,SC_CHARACTER,SC_BRACED_CODE,SC_PROLOGUE]
[SC_EPILOGUE]: Output the quadrigraphs only when not in a comment.
* src/reader.h, src/scan-gram.l (scanner_initialize): this.
* src/reader.c (reader): Use it.
2002-06-11 21:46:16 +00:00
Akim Demaille
4cdb01db9b * src/scan-gram.l (YY_OBS_FINISH): Don't set yylval.
Adjust all callers.
(scanner_last_string_free): New.
2002-06-11 21:45:49 +00:00
Akim Demaille
44995b2e39 * src/scan-gram.l (YY_INIT, YY_GROW, YY_FINISH): Rename as...
(YY_OBS_INIT, YY_OBS_GROW, YY_OBS_FINISH): these.
(last_string, YY_OBS_FREE): New.
Use them when returning an ID.
2002-06-11 21:43:18 +00:00
Akim Demaille
e9955c8373 Have Bison grammars parsed by a Bison grammar.
* src/reader.c, src/reader.h (prologue_augment): New.
* src/reader.c (copy_definition): Remove.
* src/reader.h, src/reader.c (gram_start_symbol_set, prologue_augment)
(grammar_symbol_append, grammar_rule_begin, grammar_midrule_action)
(grammar_current_rule_prec_set, grammar_current_rule_check)
(grammar_current_rule_symbol_append)
(grammar_current_rule_action_append): Export.
* src/parse-gram.y (symbol_list_new, symbol_list_symbol_append_
(symbol_list_action_append): Remove.
Hook the routines from reader.
* src/scan-gram.l: In INITIAL, characters and strings are tokens.
* src/system.h (ATTRIBUTE_NORETURN, ATTRIBUTE_UNUSED): Now.
* src/reader.c (read_declarations): Remove, unused.
* src/parse-gram.y: Handle the epilogue.
* src/reader.h, src/reader.c (gram_start_symbol_set): Rename as...
(grammar_start_symbol_set): this.
* src/scan-gram.l: Be sure to ``use'' yycontrol to keep GCC quiet.
* src/reader.c (readgram): Remove, unused.
(reader): Adjust to insert eoftoken and axiom where appropriate.
* src/reader.c (copy_dollar): Replace with...
* src/scan-gram.h (handle_dollar): this.
* src/parse-gram.y: Remove `%thong'.
* src/reader.c (copy_at): Replace with...
* src/scan-gram.h (handle_at): this.
* src/complain.h, src/complain.c (warn_at, complain_at, fatal_at):
New.
* src/scan-gram.l (YY_LINES): Keep lineno synchronized for the
time being.
* src/reader.h, src/reader.c (grammar_rule_end): New.
* src/parse.y (current_type, current_class): New.
Implement `%nterm', `%token' support.
Merge `%term' into `%token'.
(string_as_id): New.
* src/symtab.h, src/symtab.c (symbol_make_alias): Don't pass the
type name.
* src/parse-gram.y: Be sure to handle properly the beginning of
rules.
* src/parse-gram.y: Handle %type.
* src/reader.c (grammar_rule_end): Call grammar_current_rule_check.
* src/parse-gram.y: More directives support.
* src/options.c: No longer handle source directives.
* src/parse-gram.y: Fix %output.
* src/parse-gram.y: Handle %union.
Use the prologue locations.
* src/reader.c (parse_union_decl): Remove.
* src/reader.h, src/reader.c (epilogue_set): New.
* src/parse-gram.y: Use it.
* data/bison.simple, data/bison.c++: b4_stype is now either not
defined, then default to int, or to the contents of %union,
without `union' itself.
Adjust.
* src/muscle_tab.c (muscle_init): Don't predefine `stype'.
* src/output.c (actions_output): Don't output braces, as they are
already handled by the scanner.
* src/scan-gram.l (SC_CHARACTER): Set the user_token_number of
characters to themselves.
* tests/reduce.at (Reduced Automaton): End the grammars with %% so
that the epilogue has a proper #line.
* src/parse-gram.y: Handle precedence/associativity.
* src/symtab.c (symbol_precedence_set): Requires the symbol to be
a terminal.
* src/scan-gram.l (SC_BRACED_CODE): Catch strings and characters.
* tests/calc.at: Do not use `%token "foo"' as it makes not sense
at all to define terminals that cannot be emitted.
* src/scan-gram.l: Escape M4 characters.
* src/scan-gram.l: Working properly with escapes in user
strings/characters.
* tests/torture.at (AT_DATA_TRIANGULAR_GRAMMAR)
(AT_DATA_HORIZONTAL_GRAMMAR): Respect the `%token ID NUM STRING'
grammar.
Use more modest sizes, as for the time being the parser does not
release memory, and therefore the process swallows a huge amount
of memory.
* tests/torture.at (AT_DATA_LOOKAHEADS_GRAMMAR): Adjust to the
stricter %token grammar.
* src/symtab.h (associativity): Add `undef_assoc'.
(symbol_precedence_set): Do nothing when passed an undef_assoc.
* src/symtab.c (symbol_check_alias_consistence): Adjust.
* tests/regression.at (Invalid %directive): Remove, as it is now
meaningless.
(Invalid inputs): Adjust to the new error messages.
(Token definitions): The new grammar doesn't allow too many
eccentricities.
* src/lex.h, src/lex.c: Remove.
* src/reader.c (lastprec, skip_to_char, read_signed_integer)
(copy_character, copy_string2, copy_string, copy_identifier)
(copy_comment, parse_token_decl, parse_type_decl, parse_assoc_decl)
(parse_muscle_decl, parse_dquoted_param, parse_skel_decl)
(parse_action): Remove.
* po/POTFILES.in: Adjust.
2002-06-11 20:16:05 +00:00