Commit Graph

6038 Commits

Author SHA1 Message Date
Akim Demaille
1099b8dc26 symbols: document the overhaul of symbol declarations
* doc/bison.texi (Symbol Decls): New.
2018-12-17 05:57:17 +01:00
Akim Demaille
8e5b1f40ae symbols: check more invalid declarations
* tests/input.at (Invalid %nterm uses): Rename as...
(Invalid symbol declarations): this.
Extend.
2018-12-16 12:27:28 +01:00
Akim Demaille
25b9eada8c symbols: check the previous commit
* tests/input.at (Symbol declarations): New.
2018-12-16 12:27:28 +01:00
Akim Demaille
dbe499e936 regen 2018-12-16 12:27:28 +01:00
Akim Demaille
1d5956f87f symbols: clean up their parsing
Prompted by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html

We have four classes of directives that declare symbols: %nterm,
%type, %token, and the family of %left etc.  Currently not all of them
support the possibility to have several type tags (`<type>`), and not
all of them support the fact of not having any type tag at all
(%type).  Let's unify this.

- %type
  POSIX Yacc specifies that %type is for nonterminals only.  However,
  some Bison users want to use it for both tokens and nterms
  (actually, Bison's own grammar does this in several places, e.g.,
  CHAR).  So it should accept char/string literals.

  As a consequence, it cannot be used to declare tokens with their alias:
  `%type foo "foo"` would be ambiguous (are we defining foo = "foo",
  or are these two different symbols?)

  POSIX specifies that it is OK to use %type without a type tag.  I'm
  not sure what it means, but we support it.

- %token
  Accept token declarations with number and string literal:
  (ID|CHAR) NUM? STRING?.

- %left, etc.
  They cannot be the same as %token, because we accept to declare the
  symbol with %token, and to then qualify its precedence with %left.
  Then `%left foo "foo"` would also be ambiguous: foo="foo", or two
  symbols.

  They could simply be a list of identifiers, but POSIX Yacc says we
  can declare token numbers here.  I personally think this is a bad
  idea, precedence management is tricky in itself and should not be
  cluttered with token declaration issues.

  We used to accept declaring a token number on a string literal here
  (e.g., `%left "token" 1`).  This is abnormal.  Either the feature is
  useful, and then it should be supported in %token, or it's useless
  and we should not support it in corner cases.

- %nterm
  Obviously cannot accept tokens, nor char/string literals.  Does not
  exist in POSIX Yacc, but since %type also works for terminals, it is
  a nice option to have.

* src/parse-gram.y: Avoid relying on side effects.  For instance, get
rid of current_type, rather, build the list of symbols and iterate
over it to assign the type.
It's not always possible/convenient.  For instance, we still use
current_class.
Prefer "decl" to "def", since in the rest of the implementation we
actually "declare" symbols, we don't "define" them.
(token_decls, token_decls_for_prec, symbol_decls, nterm_decls): New.
Use them for %token, %left, %type and %nterm.
* src/symlist.h, src/symlist.c (symbol_list_type_set): New.
* tests/regression.at
(Token number in precedence declaration): We no longer accept
to give a number to string literals.
2018-12-16 12:27:28 +01:00
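The move away from `current_type` described in the commit above can be illustrated with a minimal C sketch: the parser first builds the list of declared symbols, then one function walks it and assigns the type tag. The names `sym`, `sym_list` and `sym_list_type_set` are hypothetical stand-ins chosen for this illustration; they are not Bison's actual `symbol_list` API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a symbol and a symbol list. */
typedef struct sym
{
  const char *name;
  const char *type_name;        /* NULL when no type tag was given */
} sym;

typedef struct sym_list
{
  sym *symbol;
  struct sym_list *next;
} sym_list;

/* Assign TYPE_NAME to every symbol of LIST, and return LIST.
   The type is an explicit argument instead of a global variable
   updated as a side effect of parsing.  */
sym_list *
sym_list_type_set (sym_list *list, const char *type_name)
{
  for (sym_list *l = list; l; l = l->next)
    {
      assert (l->symbol);       /* every node must carry a symbol */
      l->symbol->type_name = type_name;
    }
  return list;
}
```

With this shape, a declaration such as `%type <ival> exp term` would amount to building the list `exp, term` and calling `sym_list_type_set (list, "ival")` once.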
Akim Demaille
fdceb6330f symbols: set tag_seen when assigning a type to symbols
* src/reader.h, src/reader.c (tag_seen): Move to...
* src/symtab.h, src/symtab.c: here.
(symbol_type_set): Set it to true.
* src/parse-gram.y: Don't.
2018-12-15 17:41:25 +01:00
Akim Demaille
bc31dee0f7 tests: isolate test about Yacc warnings
* tests/input.at (Yacc warnings): New.
(AT_CHECK_UNUSED_VALUES): Remove checks about yacc.
2018-12-14 05:10:31 +01:00
Akim Demaille
465a47d46b parser: warn about string literals in Yacc mode
* src/scan-gram.l (scan_integer): Warn.
* tests/input.at (Yacc warnings on symbols): Check.
2018-12-14 05:10:31 +01:00
Akim Demaille
953a95695a parser: warn about hexadecimal token numbers in Yacc mode
* src/scan-gram.l (scan_integer): Warn.
* tests/input.at (Yacc warnings on symbols): Check.
2018-12-14 05:10:31 +01:00
Akim Demaille
aadf6c0bf3 parser: reprecate %nterm back
After having spent quite some time on cleaning the handling of symbol
declarations in the grammar files, I believe we should keep it.

It looks like it's a duplicate of %type, but it is not.  While POSIX
Yacc requires %type to apply only to nonterminal symbols, it appears
that both byacc and bison accept it for tokens too.  And some
experienced users do actually expect this feature to group
symbols (terminal or not) by type ("On the other hand, it is generally
more useful IMHO to group terminals and non-terminals with the same
type tag together",
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html).
Even Bison's own parser does this today (see CHAR).

Basically reverts 7928c3e6fb.

* src/scan-gram.l (%nterm): Dedeprecate, but issue a Wyacc warning.
* tests/input.at: Adjust expectations.
(Yacc warnings on symbols): New.
* src/symtab.c (symbol_class_set): Fix error introduced in
20b0746793.
2018-12-14 05:10:18 +01:00
Eduard Staniloiu
dbb855895f CI: add dmd support
* .travis.yml: here.
2018-12-11 07:06:12 +01:00
Akim Demaille
d68f05d75c style: s/non-terminal/nonterminal/
I personally prefer 'non terminal', or 'non-terminal', but
'nonterminal' is the common spelling.

* data/glr.c, src/parse-gram.y, src/symtab.c, src/symtab.h,
* tests/input.at, doc/refcard.tex: here.
2018-12-11 06:55:41 +01:00
Akim Demaille
b05aa7be2e style: rename error functions for clarity
* src/symtab.c (symbol_redeclaration, semantic_type_redeclaration)
(user_token_number_redeclaration):
Rename as...
(complain_symbol_redeclared, complain_semantic_type_redeclared)
(complain_user_token_number_redeclared):
this.
2018-12-11 06:55:35 +01:00
Akim Demaille
20b0746793 parser: improve the error message for symbol class redefinition
Currently our error messages include both "symbol redeclared" and
"symbol redefined", and they mean something different.  This is
obscure, let's make this clearer.

I think the idea behind 'definition' vs. 'declaration' is that in the
case of the nonterminals, the actual definition is its set of rules,
so %nterm would be about declaration.  The case of %token is less
clear.

* src/symtab.c (complain_class_redefined): New.
(symbol_class_set): Use it.
Simplify the logic of this function to clearly skip its body when the
preconditions are not met.
* tests/input.at (Symbol class redefinition): New.
2018-12-11 06:53:25 +01:00
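The "skip the body when the preconditions are not met" structure mentioned in the commit above can be sketched in C as an early-return guard. This is only an illustration under assumptions: `demo_symbol`, `demo_class_set` and the `complaints` counter (a stand-in for emitting a diagnostic) are hypothetical and do not reproduce the real `symbol_class_set`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical symbol classes and symbol record. */
typedef enum { unknown_sym, token_sym, nterm_sym } sym_class;

typedef struct
{
  const char *name;
  sym_class class_;
  int complaints;               /* stand-in for reported diagnostics */
} demo_symbol;

/* Set the class of SYM to C.  Return false, complain, and leave SYM's
   class untouched if it already has a different class.  */
bool
demo_class_set (demo_symbol *sym, sym_class c)
{
  assert (sym);
  /* Guard first: a conflicting class is a redefinition.  */
  if (sym->class_ != unknown_sym && sym->class_ != c)
    {
      ++sym->complaints;
      return false;
    }
  /* Preconditions hold: the body is now straight-line code.  */
  sym->class_ = c;
  return true;
}
```

Declaring the same class twice is accepted silently in this sketch; only a change of class is reported.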
Akim Demaille
afdefecab6 examples: simplify computation of yydebug
* examples/c/lexcalc/parse.y: here.
2018-12-11 06:53:25 +01:00
Akim Demaille
81dbd0d82e C++: support variadic emplace
Suggested by Askar Safin.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00006.html

* data/variant.hh: Implement.
* tests/types.at: Check.
* doc/bison.texi: Document.
2018-12-10 17:50:12 +01:00
Akim Demaille
d657da9fb4 examples: add a simple Flex+Bison example in C
Suggested by Askar Safin.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00003.html

* examples/c/lexcalc/Makefile, examples/c/lexcalc/README.md,
* examples/c/lexcalc/lexcalc.test, examples/c/lexcalc/local.mk,
* examples/c/lexcalc/parse.y, examples/c/lexcalc/scan.l:
New.
2018-12-09 15:30:25 +01:00
Akim Demaille
4cbdcaa572 regen 2018-12-09 13:55:05 +01:00
Akim Demaille
85d303b713 examples: sort them per language and complete them
Convert some of the READMEs to Markdown, which is now more common, and
nicely displayed in some git hosting services.

Add missing READMEs and Makefiles.  Generate XML, HTML and Dot files.  Be
sure to ship the test files.  Complete CLEANFILES to remove all generated
files.

* examples/calc++: Move into...
* examples/c++: here.
* examples/mfcalc, examples/rpcalc: Move into...
* examples/c: here.

* examples/README.md, examples/c++/calc++/Makefile, examples/c/local.mk,
* examples/c/mfcalc/Makefile, examples/c/rpcalc/Makefile,
* examples/d/README.md, examples/java/README.md:
New files.

* examples/test (medir): Be robust to deeper directory nesting.
2018-12-09 13:55:05 +01:00
Akim Demaille
1e6a68858a regen 2018-12-09 12:50:53 +01:00
Akim Demaille
17730b0287 parser: minor refactoring
* src/parse-gram.y (symbol.prec): Reuse int.opt.
2018-12-09 12:50:53 +01:00
Akim Demaille
157f12c483 parser: move checks inside the called functions
Revamping the handling of the symbols in the grammar is much more
delicate than I anticipated.  Let's first move things around for
clarity.

* src/symtab.c (symbol_make_alias): Don't accept to alias
non-terminals.
(symbol_user_token_number_set): Don't accept user token numbers
for non-terminals.
Don't do anything in case of redefinition, instead of trying to
update.  The flow is easier to follow this way.
2018-12-09 12:50:53 +01:00
Akim Demaille
e1a843cc69 d: fix double definition of YYSemanticType
* data/lalr1.d: When moving to b4_user_union_members, it also defines
b4_tag_seen_flag, so we had two definitions.
2018-12-08 08:05:00 +01:00
Akim Demaille
fe97793659 gnulib: update 2018-12-08 07:50:43 +01:00
Akim Demaille
401afe5cc2 parser: fix incorrect condition to raise a syntax error
* src/parse-gram.y (symbol_def): Fix test.
2018-12-06 17:50:54 +01:00
Akim Demaille
10e3ccac05 d: fix use of b4_union_members
* data/lalr1.d: Use b4_user_union_members instead.
2018-12-06 06:27:33 +01:00
Akim Demaille
3d5059f431 style: comment changes
* data/variant.hh: here.
2018-12-06 06:27:33 +01:00
Akim Demaille
9a5c12f160 java, d: add a Makefile for the example
* examples/java/Makefile, examples/d/Makefile: New.
2018-12-06 05:19:09 +01:00
Akim Demaille
156140dfc3 style: scope reduction in ielr.c
* src/ielr.c: here.
2018-12-05 07:12:12 +01:00
Akim Demaille
4176584062 style: scope reduction in lalr.c
* src/lalr.c: here.
2018-12-05 06:49:06 +01:00
Akim Demaille
cfd682f46d d, java: compute static subtractions
* data/d.m4, data/java.m4: Use b4_subtract where appropriate.
2018-12-05 06:02:01 +01:00
Akim Demaille
f15382f7d7 d: add an example
* examples/d/calc.test, examples/d/calc.y, examples/d/local.mk:
2018-12-04 20:43:01 +01:00
Akim Demaille
0ebcae4a54 d: update the skeleton
* data/d.m4, data/lalr1.d: Catch up with Bison.
And actually, also catch up with D.
2018-12-04 20:43:01 +01:00
Akim Demaille
22b2c286ff d: add experimental support for the D language
* configure.ac (ENABLE_D): New.
* src/getargs.c (valid_languages): Add d.
2018-12-04 20:29:33 +01:00
Akim Demaille
4a42a4f911 d: add skeleton for the D language
Contributed by Oliver Mangold.
https://lists.gnu.org/archive/html/help-bison/2012-01/msg00000.html

* README-D.txt, d-skel.m4, d.m4, lalr1.d: New.
2018-12-04 20:29:28 +01:00
Akim Demaille
c20dd6279f examples: regenerate them when version.texi changes
When we extract the examples from the documentation, %require
"@value{VERSION}" is replaced with the current version.  If we change
the git branch, without changing the documentation, the generated
examples will %require a version of Bison that differs from the actual
version.

* examples/local.mk (extracted.stamp): Depend on doc/version.texi.
2018-12-04 08:36:52 +01:00
Akim Demaille
999277ddd8 skeletons: start some technical documentation
* data/README: Convert to Markdown.
Start documenting some of the macros used in all our skeletons.
Simplify and fix the documentation of the macros in the skeletons.
2018-12-04 08:36:52 +01:00
Akim Demaille
f539a56620 regen 2018-12-03 18:42:00 +01:00
Akim Demaille
c44a782a4e backend: revamp the handling of symbol types
Currently it is the front end that passes the symbol types to the
backend.  For instance:

  %token <ival> NUM
  %type <ival> exp1 exp2
  exp1: NUM { $$ = $1; }
  exp2: NUM { $<ival>$ = $<ival>1; }

In both cases, $$ and $1 are passed to the backend as having type
'ival' resulting in code like `val.ival`.  This is troublesome in the
case of api.value.type=union, since in that case the code looks like this:

  %define api.value.type union
  %token <int> NUM
  %type <int> exp1 exp2
  exp1: NUM { $$ = $1; }
  exp2: NUM { $<int>$ = $<int>1; }

because in this case, since the backend does not know the symbol being
processed, it is forced to generate casts in both cases: `*(int*)(&val)`.
This is unfortunate in the first case (exp1) where there is no reason
at all to use a cast instead of `val.NUM` and `val.exp1`.

So instead delegate the computation of the actual value type to the
backend: pass $<ival>$ as `symbol-number, ival` and $$ as
`symbol-number, NULL`, instead of passing `ival` before.

* src/scan-code.l (handle_action_dollar): Find the symbol the action
is about, not just its type.  Pass both symbol-number, and explicit
type tag ($<tag>n when there is one) to b4_lhs_value and b4_rhs_value.

* data/bison.m4 (b4_symbol_action): Adjust to the new signature of
b4_dollar_pushdef.

* data/c-like.m4 (_b4_dollar_dollar, b4_dollar_pushdef): Accept the
symbol-number as new argument.

* data/c.m4 (b4_symbol_value): Accept the symbol-number as new
argument, and use it.
(b4_symbol_value_union): Accept the symbol-number as new
argument, and use it to prefer reading a union member rather than
casting the union.
* data/yacc.c (b4_lhs_value, b4_rhs_value): Accept the new
symbol-number argument.
Adjust uses of b4_dollar_pushdef.
* data/glr.c (b4_lhs_value, b4_rhs_value): Adjust.

* data/lalr1.cc (b4_symbol_value_template, b4_lhs_value): Adjust
to the new symbol-number argument.
* data/variant.hh (b4_symbol_value, b4_symbol_value_template): Accept
the new symbol-number argument.

* data/java.m4 (b4_symbol_value, b4_rhs_data): New.
(b4_rhs_value): Use them.
* data/lalr1.java: Adjust to b4_dollar_pushdef, and use b4_rhs_data.
2018-12-03 18:40:26 +01:00
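The difference between the two access paths discussed in the commit above can be shown with a small C sketch. It assumes, as the message suggests, that api.value.type=union yields a union with one member per typed symbol; the type `value` and the helper names are hypothetical and only mimic the shape of the generated code.

```c
#include <assert.h>

/* Hypothetical value type for
     %token <int> NUM
     %type <int> exp1 exp2
   with one member per typed symbol.  */
typedef union value
{
  int NUM;
  int exp1;
  int exp2;
} value;

/* `$1` in the action of exp1: the symbol is known, so its union
   member can be read directly (`val.NUM`).  */
int
rhs_by_member (const value *val)
{
  return val->NUM;
}

/* `$<int>1`: only the type tag is known, so the storage is cast
   (`*(int*)(&val)`).  */
int
rhs_by_cast (const value *val)
{
  return *(const int *) val;
}

/* Both paths read the same storage, so they must agree.  */
void
check_same_value (const value *val)
{
  assert (rhs_by_member (val) == rhs_by_cast (val));
}
```

Both functions return the same integer; the member access is simply the more readable and type-checked form, which is why it is preferable whenever the symbol is known.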
Akim Demaille
e40db8976c style: comment and formatting changes
* data/bison.m4, data/c++.m4, data/glr.c, data/java.m4, data/lalr1.cc,
* data/yacc.c, src/scan-code.l:
Fix comments.
Prefer POS to denote the position of a symbol in a rule, since NUM
is also used to denote symbol numbers.
2018-12-03 08:42:26 +01:00
Akim Demaille
d527b2d0f1 NEWS: update 2018-12-03 06:00:07 +01:00
Akim Demaille
1a27d0bf28 java: make sure the build dir exists
* examples/java/local.mk (%D%/Calc.java): here.
2018-12-03 05:45:11 +01:00
Akim Demaille
e76a934853 c++: don't define variant<S>, directly define semantic_type
Instead of defining yy::variant<S> and then alias
yy::parser::semantic_type to variant<sizeof (union_type)>, directly
define yy::parser::semantic_type.

This model is more appropriate if we want to sit the storage on top of
unions in C++11.

* data/variant.hh (b4_variant_define): Specialize and inline the
definition into...
(b4_value_type_declare): Here.
Define union_type here.
* data/lalr1.cc: Adjust.
2018-12-03 05:40:46 +01:00
Akim Demaille
7d823c505e NEWS: update 2018-12-01 17:29:04 +01:00
Akim Demaille
6ef788f810 C++: use noexcept and constexpr
There are probably more opportunities for them.
So far, I observed no performance improvements.

* data/c++.m4, data/lalr1.cc, data/stack.hh: here.
2018-12-01 12:54:42 +01:00
Akim Demaille
cc422ce677 CI: also display the examples' test suite log
* .travis.yml: here.
2018-12-01 11:13:08 +01:00
Akim Demaille
d2386a35f5 java: add an example
* examples/java/Calc.y: New, based on test 495: "Calculator
parse.error=verbose %locations".
* examples/java/Calc.test, examples/java/local.mk: New.

* configure.ac (ENABLE_JAVA): New.
* examples/test (prog): Be ready to run Java programs.
2018-12-01 11:13:08 +01:00
Akim Demaille
3422ee7435 style: unsigned int -> unsigned
See
https://lists.gnu.org/archive/html/bison-patches/2018-08/msg00027.html

* src/output.c (muscle_insert_unsigned_int_table): Rename as...
(muscle_insert_unsigned_table): this.
2018-12-01 11:13:08 +01:00
Akim Demaille
e1094c4c09 output: restore yyrhs and yyprhs
This was demanded several times.  See for instance:

- David M. Warme
  https://lists.gnu.org/archive/html/help-bison/2011-04/msg00003.html

- box12009
  http://lists.gnu.org/archive/html/bug-bison/2016-10/msg00001.html

Basically, this reverts:

- commit 3d3bc1fe30
  Get rid of (yy)rhs and (yy)prhs

- commit d333175f63
  Avoid compiler warning.

Note that since these tables are not needed in the generated parsers,
no skeleton requests them.  This change only brings back their
definition to M4, making it possible for user-defined skeletons to use
these tables.

* src/output.c (muscle_insert_item_number_table): Define.
(prepare_rules): Generate the rhs and prhs tables.
2018-12-01 11:12:59 +01:00
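As a rough illustration of what a user-defined skeleton could do with the restored tables, here is a C sketch with hand-written data. It assumes the layout used by older Bison versions: `rhs` lists the right-hand side symbols of all rules, each rule terminated by -1, and `prhs[r]` is the index in `rhs` of the first symbol of rule `r`. The grammar, the symbol numbering and every `demo_`/`S_` name are hypothetical.

```c
#include <assert.h>

/* Hypothetical symbol numbers for the toy grammar
     0: $accept: exp $end
     1: exp: exp '+' NUM
     2: exp: NUM  */
enum { S_END = 0, S_NUM = 1, S_PLUS = 2, S_ACCEPT = 3, S_EXP = 4 };
enum { DEMO_NRULES = 3 };

/* Right-hand sides, each terminated by -1 (assumed layout).  */
static const int demo_rhs[] =
{
  S_EXP, S_END, -1,
  S_EXP, S_PLUS, S_NUM, -1,
  S_NUM, -1
};

/* Index in demo_rhs of the first RHS symbol of each rule.  */
static const int demo_prhs[] = { 0, 3, 7 };

/* Number of symbols in the right-hand side of RULE.  */
int
demo_rule_length (int rule)
{
  assert (0 <= rule && rule < DEMO_NRULES);
  int len = 0;
  for (const int *p = demo_rhs + demo_prhs[rule]; *p != -1; ++p)
    ++len;
  return len;
}

/* Symbol number at position POS (0-based) in the RHS of RULE.  */
int
demo_rule_symbol (int rule, int pos)
{
  assert (0 <= pos && pos < demo_rule_length (rule));
  return demo_rhs[demo_prhs[rule] + pos];
}
```

For instance, `demo_rule_symbol (1, 1)` yields `S_PLUS`, the second symbol of `exp: exp '+' NUM`.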
Akim Demaille
060da943bd regen 2018-11-30 06:10:21 +01:00