Commit Graph

6062 Commits

Author SHA1 Message Date
Akim Demaille
0dc44adbf6 parsers: fix minor stylistic issues
* data/skeletons/variant.hh (b4_token_constructor_declare): Remove,
unused since the previous commit.
Fix indentation issues.
* data/skeletons/c++.m4: Fix indentation issues.
2018-12-27 18:23:49 +01:00
Akim Demaille
7938ab53ff c++: check several parsers in the same program
* tests/local.at (AT_LOCATION_TYPE_IF): Turn into...
(AT_LOCATION_TYPE_SPAN_IF): this.
Adjust dependencies.
* tests/headers.at (Several parsers): Add another C++ parser,
which uses the first C++ parser's locations.
2018-12-27 18:22:33 +01:00
Akim Demaille
5fb0d276b3 c++: variants: fuse declarations and definitions
We used to create a short definition of yy::parser with all the
implementations of its member functions outside.  But yy::parser is no
longer short and simple to read.  Maintaining each function twice is
painful: a lot of redundancy but different indentation levels, output
which depends on whether we are in a header or not (see
d132c2d545), etc.

Let's simplify this and put the implementations into the class
definition itself.

Discussed in this monologue:
https://lists.gnu.org/archive/html/bison-patches/2018-12/msg00058.html.

* data/skeletons/c++.m4, data/skeletons/lalr1.cc,
* data/skeletons/variant.hh (b4_basic_symbol_constructor_define)
(_b4_token_constructor_declare, b4_token_constructor_declare)
Merge into...
(b4_basic_symbol_constructor_define, _b4_token_constructor_define)
(b4_token_constructor_define): these.
2018-12-26 09:12:25 +01:00
Akim Demaille
50285ff066 examples: fix dependencies
Commit 112ccb5ed7 moved the skeletons
from dist_pkgdata_DATA to dist_skeletons_DATA, hence broke the dependencies.

* Makefile.am (dependencies): New.
Use it where appropriate.
2018-12-26 08:44:01 +01:00
Akim Demaille
f44fcd30ea c++: move stack<T> inside yy::parser
We used to define such auxiliary structures outside the class, mainly
as a matter of style to keep the definition of yy::parser short and
simple.  However, now there's a lot more code generated inside the
class definition (e.g., all the token constructors), so the
readability no longer applies.

However, if we move stack (and slice) inside yy::parser, then it
should no longer be needed to change the namespace to have multiple
parsers: changing the class name should suffice.

One common argument against inner classes is that they code bloat.  It
hardly applies here, since typically different parsers will have
different semantic value types, hence different actual stack types.

* data/skeletons/lalr1.cc: Invoke b4_stack_define inside yy::parser.
2018-12-26 08:24:38 +01:00
Akim Demaille
a4ede8f85b package: make bison a relocatable package
Suggested by David Barto
https://lists.gnu.org/archive/html/help-bison/2015-02/msg00004.html
and Victor Zverovich.
https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00121.html

This is very easy to do, thanks to work by Bruno Haible in gnulib.
See "Supporting Relocation" in gnulib's documentation.

* bootstrap.conf: We need relocatable-prog and relocatable-script (for yacc).

* src/yacc.in: New.
* configure.ac, src/local.mk: Instantiate it.
* src/main.c, src/output.c (main, pkgdatadir): Use relocatable2.

* doc/bison.texi (FAQ): Document it.
2018-12-25 10:05:36 +01:00
Akim Demaille
53f1a0b114 README: wrap paragraphs 2018-12-25 09:38:46 +01:00
Akim Demaille
ac0a0f5491 gnulib: update 2018-12-25 08:09:32 +01:00
Akim Demaille
112ccb5ed7 package: move skeletons into data/skeletons
* data/bison.m4, data/c++-skel.m4, data/c++.m4, data/c-like.m4,
* data/c-skel.m4, data/c.m4, data/d-skel.m4, data/d.m4, data/glr.c,
* data/glr.cc, data/java-skel.m4, data/java.m4, data/lalr1.cc,
* data/lalr1.d, data/lalr1.java, data/location.cc, data/stack.hh,
* data/variant.hh, data/yacc.c:
Move to...
* data/skeletons: here.
Use b4_skeletonsdir instead of b4_pkgdatadir.

* data/local.mk, src/output.c: Adjust.
2018-12-25 07:47:51 +01:00
Akim Demaille
0a4ddce822 c++: style: use consistently this/that instead of this/other
* data/lalr1.cc, data/variant.hh: here.
2018-12-24 19:05:00 +01:00
Akim Demaille
10591c8879 c++: also provide a copy constructor for symbol_type
Suggested by Wolfgang Thaller.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00081.html

* data/c++.m4 (basic_symbol, by_type): Instead of provide either move
or copy constructor, always provide the copy one.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check it.
2018-12-24 19:03:32 +01:00
Akim Demaille
807bf60cfc c++: fix double free when a symbol_type was moved
Currently the following piece of code crashes (with parse.assert),
because we don't record that s was moved-from, and we invoke its dtor.

    {
      auto s = parser::make_INT (42);
      auto s2 = std::move (s);
    }

Reported by Wolfgang Thaller.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00077.html

* data/c++.m4 (by_type): Provide a move-ctor.
(basic_symbol): Be sure not to read a moved-from value.
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Check this case.
2018-12-24 18:58:56 +01:00
Akim Demaille
45cd7dfb7f c++: style: improve tests
* tests/c++.at (C++ Variant-based Symbols Unit Tests): Provide better
assertions.
Use them.
Avoid useless Bison invocations.
2018-12-24 18:58:46 +01:00
Akim Demaille
9858165c52 c++: style: use consistently this/that instead of this/other
* data/c++.m4: here.
2018-12-24 18:55:24 +01:00
Akim Demaille
777b0163de tests: fixes
* tests/c++.at: Use AT_YYLEX_PROTOTYPE etc.
Which requires that we use the same argument names (lvalp, etc.).
* tests/local.at (AT_NAME_PREFIX): Fix regex.
2018-12-23 10:40:07 +01:00
Akim Demaille
0978148763 c++: style: rename a few macros for clarity
* data/c++.m4, data/lalr1.cc, data/variant.hh:
s/b4_symbol_constructor/b4_token_constructor/g, as this is really what
is being defined.
2018-12-22 18:28:23 +01:00
Akim Demaille
e5780041b9 c++: exhibit a safe symbol_type
Instead of introducing make_symbol (whose name, btw, somewhat
infringes on the user's "name space", if she defines a token named
"symbol"), let's make the construction of symbol_type safer, using
assertions.

For instance with:

    %token ':' <std::string> ID <int> INT;

generate:

    symbol_type (int token, const std::string&);
    symbol_type (int token, const int&);
    symbol_type (int token);

It does mean that now named token constructors (make_ID, make_INT,
etc.) go through a useless assert, but I think we can ignore this: I
assume any decent compiler will inline the symbol_type ctor inside the
make_TOKEN functions, which will show that the assert is trivially
verified, hence I expect no code will be emitted for it.  And anyway,
that's an assert, NDEBUG controls it.

* data/c++.m4 (symbol_type): Turn into a subclass of
basic_symbol<by_type>.
Declare symbol constructors when variants are enabled.
* data/variant.hh (_b4_type_constructor_declare)
(_b4_type_constructor_define): Replace with...
(_b4_symbol_constructor_declare, _b4_symbol_constructor_def): these.
Generate symbol_type constructors.
* doc/bison.texi (Complete Symbols): Document.
* tests/types.at: Check.
2018-12-22 14:55:07 +01:00
Akim Demaille
1f4dd2671a c++: provide symbol constructors per type
On

    %token <int> FOO BAR

we currently generate make_FOO(int) and make_BAR(int).  However, in
order to factor their scanners, some users would also like to have
make_symbol(tok, int), where tok is FOO or BAR.  To ensure type
safety, add assertions that do check that value type and token type
match.  Bind this assertion to the parse.assert %define variable.

Suggested by Frank Heckenbach.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00034.html
Should also match expectations from Аскар Сафин.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00023.html

* data/variant.hh: Use b4_token_visible_if where applicable.
(_b4_type_constructor_declare, _b4_type_constructor_define): New.
Use them.
2018-12-22 13:05:28 +01:00
Akim Demaille
34c52d10ac c++: style changes
* data/c++.m4, data/variant.hh: Improve layout of the generated code.
Avoid casts.
(_b4_symbol_constructor_declare, _b4_symbol_constructor_define): Rename
as...
(_b4_token_maker_declare, _b4_token_maker_define): these.
* tests/types.at: Improve pair printing.
2018-12-22 13:05:28 +01:00
Akim Demaille
a8e66fc010 style: simplify tests
* tests/types.at: Simplify C++ instantiations.
2018-12-22 11:28:59 +01:00
Akim Demaille
6e9f9fcafc style: use b4_token_visible_if
And other formatting/comment changes.

* data/variant.hh: Here.
2018-12-19 07:23:40 +01:00
Akim Demaille
b04492cc5f NEWS: update 2018-12-19 07:23:05 +01:00
Akim Demaille
98d199ccc8 c++: fix token constructors for types with commas
Bitten by macros, again.
See 680b715518.

* data/variant.hh (_b4_symbol_constructor_declare)
(_b4_symbol_constructor_define): Do not use user types, which can
include commas as in `std::pair<int, int>`, to macros.

* tests/local.at: Adjust the lex related macros to support the
case of token constructors.
* tests/types.at: Also check token constructors on types with commas.
2018-12-19 06:40:28 +01:00
Akim Demaille
93cc1fa6e8 gnulib: update 2018-12-19 06:40:28 +01:00
Akim Demaille
1099b8dc26 symbols: document the overhaul of symbol declarations
* doc/bison.texi (Symbol Decls): New.
2018-12-17 05:57:17 +01:00
Akim Demaille
8e5b1f40ae symbols: check more invalid declarations
* tests/input.at (Invalid %nterm uses): Rename as...
(Invalid symbol declarations): this.
Extend.
2018-12-16 12:27:28 +01:00
Akim Demaille
25b9eada8c symbols: check the previous commit
* tests/input.at (Symbol declarations): New.
2018-12-16 12:27:28 +01:00
Akim Demaille
dbe499e936 regen 2018-12-16 12:27:28 +01:00
Akim Demaille
1d5956f87f symbols: clean up their parsing
Prompted by Rici Lake.
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html

We have four classes of directives that declare symbols: %nterm,
%type, %token, and the family of %left etc.  Currently not all of them
support the possibility to have several type tags (`<type>`), and not
all of them support the fact of not having any type tag at all
(%type).  Let's unify this.

- %type
  POSIX Yacc specifies that %type is for nonterminals only.  However,
  some Bison users want to use it for both tokens and nterms
  (actually, Bison's own grammar does this in several places, e.g.,
  CHAR).  So it should accept char/string literals.

  As a consequence cannot be used to declare tokens with their alias:
  `%type foo "foo"` would be ambiguous (are we defining foo = "foo",
  or are these two different symbols?)

  POSIX specifies that it is OK to use %type without a type tag.  I'm
  not sure what it means, but we support it.

- %token
  Accept token declarations with number and string literal:
  (ID|CHAR) NUM? STRING?.

- %left, etc.
  They cannot be the same as %token, because we accept to declare the
  symbol with %token, and to then qualify its precedence with %left.
  Then `%left foo "foo"` would also be ambiguous: foo="foo", or two
  symbols.

  They cannot be simply a list of identifiers, but POSIX Yacc says we
  can declare token numbers here.  I personally think this is a bad
  idea, precedence management is tricky in itself and should not be
  cluttered with token declaration issues.

  We used to accept declaring a token number on a string literal here
  (e.g., `%left "token" 1`).  This is abnormal.  Either the feature is
  useful, and then it should be supported in %token, or it's useless
  and we should not support it in corner cases.

- %nterm
  Obviously cannot accept tokens, nor char/string literals.  Does not
  exist in POSIX Yacc, but since %type also works for terminals, it is
  a nice option to have.

* src/parse-gram.y: Avoid relying on side effects.  For instance, get
rid of current_type, rather, build the list of symbols and iterate
over it to assign the type.
It's not always possible/convenient.  For instance, we still use
current_class.
Prefer "decl" to "def", since in the rest of the implementation we
actually "declare" symbols, we don't "define" them.
(token_decls, token_decls_for_prec, symbol_decls, nterm_decls): New.
Use them for %token, %left, %type and %nterm.
* src/symlist.h, src/symlist.c (symbol_list_type_set): New.
* tests/regression.at b/tests/regression.at
(Token number in precedence declaration): We no longer accept
to give a number to string literals.
2018-12-16 12:27:28 +01:00
Akim Demaille
fdceb6330f symbols: set tag_seen when assigning a type to symbols
* src/reader.h, src/reader.c (tag_seen): Move to...
* src/symtab.h, src/symtab.c: here.
(symbol_type_set): Set it to true.
* src/parse-gram.y: Don't.
2018-12-15 17:41:25 +01:00
Akim Demaille
bc31dee0f7 tests: isolate test about Yacc warnings
* tests/input.at (Yacc warnings): New.
(AT_CHECK_UNUSED_VALUES): Remove checks about yacc.
2018-12-14 05:10:31 +01:00
Akim Demaille
465a47d46b parser: warn about string literals in Yacc mode
* src/scan-gram.l (scan_integer): Warn.
* tests/input.at (Yacc warnings on symbols): Check.
2018-12-14 05:10:31 +01:00
Akim Demaille
953a95695a parser: warn about hexadecimal token numbers in Yacc mode
* src/scan-gram.l (scan_integer): Warn.
* tests/input.at (Yacc warnings on symbols): Check.
2018-12-14 05:10:31 +01:00
Akim Demaille
aadf6c0bf3 parser: reprecate %nterm back
After having spent quite some time on cleaning the handling of symbol
declarations in the grammar files, I believe we should keep it.

It looks like it's a duplicate of %type, but it is not.  While POSIX
Yacc requires %type to apply only to nonterminal symbols, it appears
that both byacc and bison accept it for tokens too.  And some
experienced users do actually expect this feature to group
symbols (terminal or not) by type ("On the other hand, it is generally
more useful IMHO to group terminals and non-terminals with the same
type tag together",
http://lists.gnu.org/archive/html/bug-bison/2018-10/msg00000.html).
Even Bison's own parser does this today (see CHAR).

Basically reverts 7928c3e6fb.

* src/scan-gram.l (%nterm): Dedeprecate, but issue a Wyacc warning.
* tests/input.at: Adjust expectations.
(Yacc warnings  on symbols): New.
* src/symtab.c (symbol_class_set): Fix error introduced in
20b0746793.
2018-12-14 05:10:18 +01:00
Eduard Staniloiu
dbb855895f CI: add dmd support
* .travis.yml: here.
2018-12-11 07:06:12 +01:00
Akim Demaille
d68f05d75c style: s/non-terminal/nonterminal/
I personally prefer 'non terminal', or 'non-terminal', but
'nonterminal' is the common spelling.

* data/glr.c, src/parse-gram.y, src/symtab.c, src/symtab.h,
* tests/input.at, doc/refcard.tex: here.
2018-12-11 06:55:41 +01:00
Akim Demaille
b05aa7be2e style: rename error functions for clarity
* src/symtab.c (symbol_redeclaration, semantic_type_redeclaration)
(user_token_number_redeclaration):
Rename as...
(complain_symbol_redeclared, complain_semantic_type_redeclared)
(complain_user_token_number_redeclared):
this.
2018-12-11 06:55:35 +01:00
Akim Demaille
20b0746793 parser: improve the error message for symbol class redefinition
Currently our error messages include both "symbol redeclared" and
"symbol redefined", and they mean something different.  This is
obscure, let's make this clearer.

I think the idea between 'definition' vs. 'declaration' is that in the
case of the nonterminals, the actual definition is its set of rules,
so %nterm would be about declaration.  The case of %token is less
clear.

* src/symtab.c (complain_class_redefined): New.
(symbol_class_set): Use it.
Simplify the logic of this function to clearly skip its body when the
preconditions are not met.
* tests/input.at (Symbol class redefinition): New.
2018-12-11 06:53:25 +01:00
Akim Demaille
afdefecab6 examples: simplify computation of yydebug
* examples/c/lexcalc/parse.y: here.
2018-12-11 06:53:25 +01:00
Akim Demaille
81dbd0d82e C++: support variadic emplace
Suggested by Askar Safin.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00006.html

* data/variant.hh: Implement.
* tests/types.at: Check.
* doc/bison.texi: Document.
2018-12-10 17:50:12 +01:00
Akim Demaille
d657da9fb4 examples: add a simple Flex+Bison example in C
Suggested by Askar Safin.
http://lists.gnu.org/archive/html/bug-bison/2018-12/msg00003.html

* examples/c/lexcalc/Makefile, examples/c/lexcalc/README.md,
* examples/c/lexcalc/lexcalc.test, examples/c/lexcalc/local.mk,
* examples/c/lexcalc/parse.y, examples/c/lexcalc/scan.l:
New.
2018-12-09 15:30:25 +01:00
Akim Demaille
4cbdcaa572 regen 2018-12-09 13:55:05 +01:00
Akim Demaille
85d303b713 examples: sort them per language and complete them
Convert some of the READMEs to Markdown, which is now more common, and
nicely displayed in some git hosting services.

Add missing READMEs and Makefiles.  Generate XML, HTML and Dot files.  Be
sure to ship the test files.  Complete CLEANFILES to remove all generated
files.

* examples/calc++: Move into...
* examples/c++: here.
* examples/mfcalc, examples/rpcalc: Move into...
* examples/c: here.

* examples/README.md, examples/c++/calc++/Makefile, examples/c/local.mk,
* examples/c/mfcalc/Makefile, examples/c/rpcalc/Makefile,
* examples/d/README.md, examples/java/README.md:
New files.

* examples/test (medir): Be robust to deeper directory nesting.
2018-12-09 13:55:05 +01:00
Akim Demaille
1e6a68858a regen 2018-12-09 12:50:53 +01:00
Akim Demaille
17730b0287 parser: minor refactoring
* src/parse-gram.y (symbol.prec): Reuse int.opt.
2018-12-09 12:50:53 +01:00
Akim Demaille
157f12c483 parser: move checks inside the called functions
Revamping the handling of the symbols is the grammar is much more
delicate than I anticipated.  Let's first move things around for
clarity.

* src/symtab.c (symbol_make_alias): Don't accept to alias
non-terminals.
(symbol_user_token_number_set): Don't accept user token numbers
for non-terminals.
Don't do anything in case of redefinition, instead of trying to
update.  The flow is eaier to follow this way.
2018-12-09 12:50:53 +01:00
Akim Demaille
e1a843cc69 d: fix double definition of YYSemanticType
* data/lalr1.d: When moving to b4_user_union_members, it also defines
b4_tag_seen_flag, so we had two definitions.
2018-12-08 08:05:00 +01:00
Akim Demaille
fe97793659 gnulib: update 2018-12-08 07:50:43 +01:00
Akim Demaille
401afe5cc2 parser: fix incorrect condition to raise a syntax error
* src/parse-gram.y (symbol_def): Fix test.
2018-12-06 17:50:54 +01:00
Akim Demaille
10e3ccac05 d: fix use of b4_union_members
* data/lalr1.d: Use b4_user_union_members instead.
2018-12-06 06:27:33 +01:00