Commit Graph

6935 Commits

Author SHA1 Message Date
Akim Demaille
e364bcdbc5 doc: c++: promote api.token.raw
* doc/bison.texi (Calc++ Parser): Here.
2020-03-23 07:02:32 +01:00
Akim Demaille
5a8db8a739 bench: calc: no need for super long inputs
* etc/bench.pl.in ($iterations): Restore initial value, -1, meaning
"at least one second".
($calc_input): There is no need to generate 400 lines.
2020-03-22 15:59:22 +01:00
Akim Demaille
5acc29041e bench: calc: work on a string instead of a file
The cost of the file layer is large and makes benchmarks too coarse,
as seen in the following example, first with a file, then with a
literal string:

    0. %skeleton "yacc.c" %define parse.lac full
    1. %skeleton "yacc-v1.c" %define nofinal %define parse.lac full
    2. %skeleton "yacc-v2.c" %define nofinal %define parse.lac full
    3. %skeleton "yacc-v3.c" %define nofinal %define parse.lac full
    4. %skeleton "yacc.c"
    5. %skeleton "yacc-v1.c" %define nofinal
    6. %skeleton "yacc-v2.c" %define nofinal
    7. %skeleton "yacc-v3.c" %define nofinal
    --------------------------------------------------
    Benchmark           Time           CPU Iterations
    --------------------------------------------------
    BM_y0           32558 ns      32537 ns      21228
    BM_y1           32400 ns      32369 ns      21233
    BM_y2           33485 ns      33464 ns      20625
    BM_y3           32139 ns      32125 ns      21446
    BM_y4           31343 ns      31329 ns      21747
    BM_y5           31344 ns      31317 ns      22035
    BM_y6           31287 ns      31255 ns      22039
    BM_y7           31387 ns      31373 ns      22178
    --------------------------------------------------
    Benchmark           Time           CPU Iterations
    --------------------------------------------------
    BM_y0           10642 ns      10634 ns      63601
    BM_y1           10657 ns      10654 ns      63625
    BM_y2           10441 ns      10432 ns      65957
    BM_y3           10558 ns      10554 ns      64546
    BM_y4            9521 ns       9516 ns      72011
    BM_y5            9179 ns       9157 ns      75028
    BM_y6            9360 ns       9356 ns      73770
    BM_y7            9365 ns       9359 ns      72609

Of course, at the same time it is less realistic: most users read
files rather than strings, so it might lead us to pay attention to
costs most people don't see.

* etc/bench.pl.in (&calc_input): Output into a file given as argument.
Output in C syntax.
(&generate_grammar_calc): Use it.
Simplify the grammar: remove operators we don't care about.
Rewrite the scanner to work on a char* instead of a FILE*.
2020-03-22 15:59:22 +01:00
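
A minimal sketch of a scanner that works on a char* instead of a FILE*, in the spirit of the change above. It is not the code generated by etc/bench.pl.in: the names scan_token and TOK_* are hypothetical, and the token set is an assumption made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token kinds; the benchmark grammar defines its own.  */
enum { TOK_EOF = 0, TOK_NUM = 256, TOK_ERROR = 257 };

/* Read one token starting at *CURSOR and advance *CURSOR past it.
   Numbers are stored in *VALUE; operators and parentheses are
   returned as their character code.  No FILE* is involved: the
   whole input is an in-memory string.  */
static int
scan_token (const char **cursor, int *value)
{
  assert (cursor && *cursor && value);
  const char *p = *cursor;
  while (*p == ' ' || *p == '\t' || *p == '\n')
    ++p;

  int kind;
  if (*p == '\0')
    kind = TOK_EOF;             /* Stay on the terminator.  */
  else if (isdigit ((unsigned char) *p))
    {
      char *end;
      *value = (int) strtol (p, &end, 10);
      p = end;
      kind = TOK_NUM;
    }
  else if (strchr ("+-*/()", *p))
    kind = *p++;
  else
    {
      kind = TOK_ERROR;
      ++p;
    }
  *cursor = p;
  return kind;
}
```

The caller owns the cursor, so the same string can be scanned again simply by resetting the pointer, which is convenient when a benchmark repeats the parse many times.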
Akim Demaille
5b0b0a1e08 bench: add a "latest" symlink
* etc/bench.pl.in: here.
2020-03-22 15:59:14 +01:00
Akim Demaille
1c694e08cc bench: use the same prefix in both bench methods
* etc/bench.pl.in (&bench_with_timethese): Also use y$i, as in
&bench_with_gbenchmark.
(&generate_grammar_calc): Don't add a prefix, let the callers do it.
2020-03-22 15:59:13 +01:00
Akim Demaille
4cfb067d93 bench: use a C++-11 compiler
See https://github.com/google/benchmark#a-faster-keeprunning-loop.

* etc/bench.pl.in ($cxx): Be C++11.
(&bench_with_gbenchmark): Adjust.
2020-03-22 15:59:13 +01:00
Akim Demaille
cf60d0a617 bench: create a README file with benches
* etc/bench.pl.in (&bench_with_gbenchmark): Here.
2020-03-22 15:59:13 +01:00
Akim Demaille
c0e8489605 bench: calc: add support for google benchmark
* etc/bench.pl.in (&compiler): New, extracted from...
(&compile): here.
Don't link when using gbm.
(&calc_input): Don't make massive input for micro
benchmarks.
(&generate_grammar_calc): When using gbm, use api.prefix to avoid name
collisions.
Be ready to issue BENCHMARKS instead of a main.
(&bench): Rename as...
(&bench_with_timethese): this.
(&bench_with_gbenchmark): New.
(&bench): New.
Dispatch on these two.
2020-03-21 18:19:14 +01:00
Akim Demaille
788b1a6858 bench: better error messages on invalid input
* etc/bench.pl.in: here.
2020-03-21 18:17:09 +01:00
Akim Demaille
56414791e9 bench: simplify the calc grammar
* etc/bench.pl.in (generate_grammar_calc): We don't need global_result
etc.
2020-03-21 18:17:02 +01:00
Akim Demaille
675dcf1962 bench: die clearly on incorrect --grammar arguments
* etc/bench.pl.in (getopt): here.
2020-03-21 14:52:41 +01:00
Akim Demaille
466fb66578 regen
2020-03-17 19:21:24 +01:00
Akim Demaille
cbb967dbad yacc.c: style: prefer switch to if
* data/skeletons/yacc.c: Prefer switch to decode yy_lac's return value.
2020-03-17 19:21:07 +01:00
Akim Demaille
44ac18d136 yacc.c: yypstate_expected_tokens
In push parsers, when asking for the list of expected tokens at some
point, it makes no sense to build a yyparse_context_t: the yypstate
alone suffices (the only difference being the lookahead).  Instead of
forcing the user to build a useless shell around yypstate, let's offer
yypstate_expected_tokens.

See https://lists.gnu.org/r/bison-patches/2020-03/msg00025.html.

* data/skeletons/yacc.c (yypstate): Declare earlier, so that we can
use it for...
(yypstate_expected_tokens): this new function, when in push parsers.
Adjust dependencies.
* examples/c/bistromathic/parse.y: Simplify: use
yypstate_expected_tokens.
Style fixes.
Reduce scopes (reported by Joel E. Denny).
2020-03-17 19:20:13 +01:00
Akim Demaille
0c3dd3a669 examples: bistromathic: simplify
* examples/c/bistromathic/parse.y (expected_tokens): Remove useless "break".
2020-03-09 07:24:33 +01:00
Akim Demaille
951da960e6 merge branch 'maint'
* upstream/maint:
  maint: post-release administrivia
  version 3.5.3
  news: update for 3.5.3
  yacc.c: make sure we properly propagated the user's number for error
  diagnostics: don't crash because of repeated definitions of error
  style: initialize some struct members
  diagnostics: beware of zero-width characters
  diagnostics: be sure to close the styling when lines are too short
  muscles: fix incorrect decoding of $
  code: be robust to reference with invalid tags
  build: fix typo
  doc: update recommandation for libtextstyle
  style: comment changes
  examples: use consistently the GFDL header for readmes
  style: remove useless declarations
  typo: succesful -> successful
  README: point to tests/bison, and document --trace
  gnulib: update
  maint: post-release administrivia
2020-03-08 10:13:16 +01:00
Akim Demaille
15ea35019f maint: post-release administrivia
* NEWS: Add header line for next release.
* .prev-version: Record previous version.
* cfg.mk (old_NEWS_hash): Auto-update.
2020-03-08 08:50:10 +01:00
Akim Demaille
f49684a577 version 3.5.3
* NEWS: Record release date.
v3.5.3
2020-03-08 08:30:41 +01:00
Akim Demaille
044ad1288c news: update for 3.5.3
2020-03-08 08:17:13 +01:00
Akim Demaille
e3812bb8c3 yacc.c: make sure we properly propagated the user's number for error
* data/skeletons/yacc.c (YYERRCODE): Be truthful.
* tests/input.at (Redefining the error token): Check that.
2020-03-08 08:10:11 +01:00
Akim Demaille
cfcd823e16 diagnostics: don't crash because of repeated definitions of error
According to https://www.unix.com/man-page/POSIX/1posix/yacc/, the
user is allowed to specify her user number for the error token:

    The token error shall be reserved for error handling. The name
    error can be used in grammar rules. It indicates places where the
    parser can recover from a syntax error. The default value of error
    shall be 256. Its value can be changed using a %token
    declaration. The lexical analyzer should not return the value of
    error.

I think this feature is useless, the user should not have to deal with
that.  The intent is probably to give the user a means to use 256 if
she wants to, but provided "error" cleared the path first by being
assigned another number.  In the case of Bison, 256 is assigned to
"error" at the end if the user did not use it for a token of hers.  So
this feature is useless.

Yet it is valid, and if the user assigns twice a token number to
"error", then the second time we want to complain about it and want to
show the original definition.  At this point, we try to display the
built-in definition of "error", whose location is NULL, and we crash.

Rather, the location of the first user definition of "error" should
become its defining location.

Reported by Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00007.html

* src/symtab.c (symbol_class_set): If this is a declaration and the
symbol was not declared yet, keep this as defining location.
* tests/input.at (Redefining the error token): New.
2020-03-08 08:10:11 +01:00
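
A minimal sketch of the idea behind this fix, assuming simplified stand-ins for Bison's types. The structs and the function symbol_declare are hypothetical and much smaller than what src/symtab.c really contains. The point is only that the first user declaration becomes the defining location, so a later "redefinition" diagnostic never has to print a NULL location.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Bison's location and symbol.  */
typedef struct
{
  const char *file;
  int line;
} location;

typedef struct
{
  const char *tag;
  bool declared;                /* Declared by the user?  */
  const location *defined_at;   /* NULL for a built-in definition.  */
} symbol;

/* Record a user declaration of SYM at LOC.  Return NULL if this is the
   first declaration, otherwise the location of the previous one, which
   the caller can safely show in a "previous definition" note.  */
static const location *
symbol_declare (symbol *sym, const location *loc)
{
  assert (sym && loc);
  if (!sym->declared)
    {
      /* First user declaration: it becomes the defining location,
         replacing the location-less built-in definition.  */
      sym->declared = true;
      sym->defined_at = loc;
      return NULL;
    }
  return sym->defined_at;
}
```

With a built-in symbol such as "error" (defined_at == NULL), the second call returns the location of the first user declaration rather than NULL.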
Akim Demaille
2f02d9beae style: initialize some struct members
* src/symtab.c (sym_content_new): Initialize all the location members.
Not needed by the code, but disturbing values when using a debugger.
2020-03-08 08:10:11 +01:00
Akim Demaille
b638603477 diagnostics: beware of zero-width characters
Currently we rely on (visual) width of the characters to decide where
to open and close the styling of the quoted lines.  This breaks when
we deal with zero-width characters: we cannot just rely on (visual)
columns, we need to know whether we are before, inside, or after the
highlighted portion.

* src/location.c (location_caret): col_end: no longer add 1, "regular"
characters have a width of 1, only 0-width characters have 0-width.
opened: replace with 'state', a three-valued enum.
Don't reopen the style if we already did.
* tests/diagnostics.at (Zero-width characters): New.
2020-03-08 08:10:11 +01:00
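
A minimal sketch of the three-valued state described above. It is not the code of location_caret: the function highlight, the helper append, and the explicit array of widths are assumptions made for illustration, and "<error>" stands for whatever opens the styling. Columns start at 1 and the highlighted range is [col_start, col_end).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Where we are with respect to the highlighted portion.  */
enum caret_state { BEFORE, INSIDE, AFTER };

/* Append SRC to the NUL-terminated buffer OUT of capacity SIZE,
   truncating if needed.  */
static void
append (char *out, size_t size, const char *src)
{
  size_t len = strlen (out);
  if (len + 1 < size)
    strncat (out, src, size - len - 1);
}

/* Copy LINE into OUT, wrapping columns [COL_START, COL_END) in
   <error>...</error>.  WIDTH[i] is the visual width of LINE[i]
   (0 for a zero-width character).  */
static void
highlight (const char *line, const int *width,
           int col_start, int col_end, char *out, size_t size)
{
  assert (line && width && out && 0 < size);
  enum caret_state state = BEFORE;
  int col = 1;
  out[0] = '\0';
  for (size_t i = 0; line[i]; ++i)
    {
      if (state == BEFORE && col_start <= col)
        {
          append (out, size, "<error>");
          state = INSIDE;
        }
      char c[2] = { line[i], '\0' };
      append (out, size, c);
      col += width[i];
      if (state == INSIDE && col_end <= col)
        {
          append (out, size, "</error>");
          state = AFTER;        /* Never reopen, whatever the column.  */
        }
    }
  /* The line ended before COL_END was reached: still close.  */
  if (state == INSIDE)
    append (out, size, "</error>");
}
```

Because the decision to open depends on the state and not only on the column, a zero-width character sitting after the highlighted portion cannot reopen the style, and a line shorter than the range is still closed.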
Akim Demaille
e21ff47f5d diagnostics: be sure to close the styling when lines are too short
bar.y:4.12-17: <error>error:</error> redefining user token number of foo
    -    4 | %token foo <error>123
    +    4 | %token foo <error>123</error>
           |            <error>^~~~~~</error>

* src/location.c (location_caret): Be sure to close.
* tests/diagnostics.at (Line is too short, and then you die): New.
2020-03-07 10:01:52 +01:00
Akim Demaille
b82b387da9 muscles: fix incorrect decoding of $
Bug introduced in 458171e6df.
https://lists.gnu.org/archive/html/bison-patches/2013-11/msg00009.html

Reported by Ahcheong Lee.
https://lists.gnu.org/r/bug-bison/2020-03/msg00010.html

* src/muscle-tab.c (COMMON_DECODE): "$" is coded as "$][", not "$[][".
* tests/input.at ("%define" enum variables): Check that case.
2020-03-07 07:45:10 +01:00
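
A minimal sketch of a decoder for the encoding mentioned above, where "$" is coded as "$][". The function muscle_decode is hypothetical and is not the COMMON_DECODE macro itself. The handling of "@@", "@{" and "@}" is an assumption about the rest of the encoding, not something stated in this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Decode SRC into DST (capacity SIZE, at least 1).  Return 0 on
   success, -1 on malformed input or if DST is too small.
   "$][" stands for "$" (from the commit message).  Assumed, for
   illustration only: "@@" is "@", "@{" is "[", "@}" is "]".  */
static int
muscle_decode (const char *src, char *dst, size_t size)
{
  assert (src && dst && 0 < size);
  size_t n = 0;
  while (*src)
    {
      char c;
      if (*src == '$')
        {
          /* Exactly "$][": anything else, e.g. "$[][", is an error.  */
          if (src[1] != ']' || src[2] != '[')
            return -1;
          c = '$';
          src += 3;
        }
      else if (*src == '@')
        {
          switch (src[1])
            {
            case '@': c = '@'; break;
            case '{': c = '['; break;
            case '}': c = ']'; break;
            default: return -1;
            }
          src += 2;
        }
      else
        c = *src++;
      if (size <= n + 1)
        return -1;              /* Keep room for the terminator.  */
      dst[n++] = c;
    }
  dst[n] = '\0';
  return 0;
}
```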
Akim Demaille
641e326303 code: be robust to reference with invalid tags
Because we want to support $<a->b>$, we must accept -> in type tags,
and reject $<->$, as it is unfinished.
Reported by Ahcheong Lee.

* src/scan-code.l (yylex): Make sure "tag" does not end with -, since
-> does not close the tag.
* tests/input.at (Stray $ or @): Check this.
2020-03-06 17:29:26 +01:00
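
A minimal sketch of the rule described above, written as a hand-made C function rather than the Flex pattern actually used in src/scan-code.l. The name tag_length is hypothetical. It treats "->" as part of the tag, so a ">" that directly follows "-" never closes the tag.

```c
#include <assert.h>
#include <stddef.h>

/* S points just after the '<' of a "$<tag>$" reference.  Return the
   length of the tag if it is non-empty and closed by '>', 0 otherwise.  */
static size_t
tag_length (const char *s)
{
  assert (s);
  size_t i = 0;
  for (;;)
    {
      if (s[i] == '\0' || s[i] == '\n')
        return 0;               /* Unfinished tag.  */
      if (s[i] == '-' && s[i + 1] == '>')
        i += 2;                 /* "->" belongs to the tag.  */
      else if (s[i] == '>')
        break;                  /* A lone '>' closes the tag.  */
      else
        ++i;
    }
  return i;                     /* 0 for the empty tag "<>".  */
}
```

On "a->b>$" the tag is "a->b". On "->$" the "->" is consumed as part of the tag and no closing ">" is ever found, so the reference is rejected as unfinished.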
Akim Demaille
192e9fdf77 build: fix typo
* build-aux/cross-options.pl: here.
2020-03-06 08:32:26 +01:00
Akim Demaille
a4a3f08c11 doc: update recommandation for libtextstyle
* README: here.
2020-03-06 08:32:18 +01:00
Akim Demaille
666df338a7 style: comment changes
* src/symtab.h, src/lr0.c: here.
2020-03-06 08:32:03 +01:00
Akim Demaille
b437b16603 examples: use consistently the GFDL header for readmes
* examples/c++/README.md, examples/c++/calc++/README.md,
* examples/c/calc/README.md, examples/c/lexcalc/README.md,
* examples/c/reccalc/README.md:
Prefer the GFDL banner to the GPL one.
2020-03-06 08:31:34 +01:00
Akim Demaille
b493c173c9 style: remove useless declarations
* src/reader.h: Don't duplicate what parse-gram.h already exposes.
* src/lr0.h: Remove useless include.
2020-03-06 08:30:21 +01:00
Adrian Vogelsgesang
aab3feb5a1 typo: succesful -> successful
* data/skeletons/lalr1.cc: here
* etc/bench.pl.in: here
* src/location.c: and here.
2020-03-06 08:29:58 +01:00
Akim Demaille
b7942f2661 README: point to tests/bison, and document --trace
Reported by Victor Morales Cayuela.

* README, README-hacking.md: here.
2020-03-06 08:28:23 +01:00
Akim Demaille
cefb538ab0 gnulib: update
2020-03-06 08:25:52 +01:00
Akim Demaille
ecd922024e README: point to tests/bison, and document --trace
Reported by Victor Morales Cayuela.

* README, README-hacking.md: here.
2020-03-05 17:56:55 +01:00
Akim Demaille
2353ce7216 yacc.c: simplify yyparse_context_t member names
* data/skeletons/yacc.c (yyparse_context_t): Rename yyes_p and
yyes_capacity_p as...
(yyes, yyes_capacity): These.
2020-03-05 07:26:50 +01:00
Akim Demaille
9cc76ee62c yacc.c: yyerror_range does not need to be preserved accross calls
* data/skeletons/yacc.c (b4_parse_state_variable_macros): Don't define
yyerror_range.
(yyparse): Add yyerror_range as local variable.
2020-03-05 07:26:49 +01:00
Akim Demaille
2f83ef57f3 yacc.c: push: undefine the pstate macros for the epilogue
* data/skeletons/yacc.c (b4_macro_define, b4_macro_undef)
(b4_pstate_macro_define, b4_parse_state_variable_macros):
New.
Use them.
* examples/c/bistromathic/parse.y: Remove now useless undefs.
2020-03-05 07:26:49 +01:00
Akim Demaille
744171ddbf yacc.c: push: initialize the pstate variables in pstate_new
Currently pstate_new does not set up its variables, this task is left
to yypush_parse.  This was probably to share more code with usual pull
parsers, where these (local) variables are indeed initialized by
yyparse.

But as a consequence yyexpected_tokens crashes at the very beginning
of the parse, since, for instance, the stacks are not even set up.
See https://lists.gnu.org/r/bison-patches/2020-03/msg00001.html.

The fix could have been very simple, but the documentation actually makes
it very clear that we can reuse a pstate for several parses:

    After yypush_parse returns a status other than YYPUSH_MORE, the
    parser instance yyps may be reused for a new parse.

so we need to restore the parser to its pristine state so that (i) it
is ready to run the next parse, (ii) it properly supports
yyexpected_tokens for the next run.

* data/skeletons/yacc.c (b4_initialize_parser_state_variables): New,
extracted from the top of yyparse/yypush_parse.
(yypstate_clear): New.
(yypstate_new): Use it when push parsers are enabled.
Define after the yyps macros so that we can use the same code as the
regular pull parsers.
(yyparse): Use it when push parsers are _not_ enabled.

* examples/c/bistromathic/bistromathic.test: Check the completion on
the beginning of the line.
2020-03-05 07:13:23 +01:00
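
A toy sketch of the pattern described above: the state is put in its pristine condition both when it is allocated and when a parse ends, so that it can be queried before the first token and reused afterwards. This is not the skeleton's code. The type pstate, the functions pstate_clear, pstate_new, pstate_top and pstate_push, and the status codes are hypothetical, and the "parser" merely stacks the tokens it receives.

```c
#include <assert.h>
#include <stdlib.h>

enum { PUSH_DONE = 0, PUSH_MORE = 1 };  /* Hypothetical status codes.  */
enum { STACK_CAPACITY = 16 };

typedef struct
{
  int stack[STACK_CAPACITY];
  int *top;                     /* Current top of the stack.  */
} pstate;

/* Restore PS to its pristine state.  */
static void
pstate_clear (pstate *ps)
{
  assert (ps);
  ps->stack[0] = 0;             /* Initial state.  */
  ps->top = ps->stack;
}

/* Allocate a state that is immediately usable.  */
static pstate *
pstate_new (void)
{
  pstate *ps = malloc (sizeof *ps);
  if (ps)
    pstate_clear (ps);
  return ps;
}

/* A query that reads the stack, as a list of expected tokens would:
   it must be valid even before the first token is pushed.  */
static int
pstate_top (const pstate *ps)
{
  assert (ps && ps->top);
  return *ps->top;
}

/* Feed one token.  Token 0 (end of input) or a full stack ends the
   parse, and the state is cleared so that it can be reused.  */
static int
pstate_push (pstate *ps, int token)
{
  assert (ps);
  if (token == 0 || ps->top == ps->stack + STACK_CAPACITY - 1)
    {
      pstate_clear (ps);
      return PUSH_DONE;
    }
  *++ps->top = token;
  return PUSH_MORE;
}
```

Had the initialization been left to the first call to pstate_push, calling pstate_top right after pstate_new would read an uninitialized pointer, which is the kind of crash reported above.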
Akim Demaille
4fd3282dd7 style: formatting changes
* data/skeletons/yacc.c, tests/torture.at: here.
2020-03-04 08:24:36 +01:00
Akim Demaille
67793793e8 bistromathic: properly compute the lcp, as expected by readline
Currently completion on "at" proposes only "atan", but does not
actually complete "at" into "atan".

* examples/c/bistromathic/parse.y (completion): Install the lcp in
matches[0].
* examples/c/bistromathic/bistromathic.test: Check that case.
2020-03-04 08:24:36 +01:00
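
A minimal sketch of computing the longest common prefix (lcp) and installing it in matches[0]. It follows readline's documented convention for an array of matches: entry 0 is the substitution for the text being completed, the following entries are the candidates, and the array ends with NULL. The functions lcp_length and install_lcp are hypothetical, not the ones in parse.y.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length of the longest common prefix of the N strings in WORDS.  */
static size_t
lcp_length (char *const *words, size_t n)
{
  assert (words && 0 < n);
  size_t len = strlen (words[0]);
  for (size_t i = 1; i < n; ++i)
    {
      size_t j = 0;
      while (j < len && words[i][j] == words[0][j])
        ++j;
      len = j;
    }
  return len;
}

/* MATCHES[1] .. MATCHES[N] are the candidates.  Store a freshly
   allocated copy of their longest common prefix in MATCHES[0].
   Return 0 on success, -1 if memory is exhausted.  */
static int
install_lcp (char **matches, size_t n)
{
  assert (matches && 0 < n);
  size_t len = lcp_length (matches + 1, n);
  char *lcp = malloc (len + 1);
  if (!lcp)
    return -1;
  memcpy (lcp, matches[1], len);
  lcp[len] = '\0';
  matches[0] = lcp;
  return 0;
}
```

With "atan" as the only candidate, matches[0] becomes "atan", so "at" is actually completed instead of merely being listed.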
Akim Demaille
f334775dbf bistromathic: don't require spaces after operators for completion
Currently "(1+<TAB>" does not work as expected, because "+" is not a
word breaking character.

* examples/c/bistromathic/parse.y (init_readline): Specify our word
breaking characters.
* examples/c/bistromathic/bistromathic.test: Avoid trailing spaces.
2020-03-04 08:24:35 +01:00
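
A minimal sketch of why the set of word breaking characters matters. It does not use readline: the function word_start is a hypothetical stand-in for the way a completer locates the beginning of the word to complete, and the two sets of characters below are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index in LINE where the word ending at POINT starts,
   given the set BREAKS of word breaking characters.  */
static size_t
word_start (const char *line, size_t point, const char *breaks)
{
  assert (line && breaks && point <= strlen (line));
  size_t start = point;
  while (0 < start && !strchr (breaks, line[start - 1]))
    --start;
  return start;
}
```

With only blanks as breaking characters, the word to complete in "(1+at" is the whole "(1+at", which matches no function name. Once "+", "(" and the other operators are breaking characters, the word is "at", and completion can propose "atan".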
Akim Demaille
feb1011c8b bistromathic: check completion
* examples/c/bistromathic/bistromathic.test: here.
* examples/c/bistromathic/parse.y (expected_tokens): Fix a memory
leak.
2020-03-02 06:58:25 +01:00
Akim Demaille
99bedadf23 m4: remove b4_function_define and b4_function_declare
* data/skeletons/c.m4: here.
2020-03-02 06:58:20 +01:00
Akim Demaille
79c3f2b8fd m4: decommission b4_function_declare
* data/skeletons/glr.c, data/skeletons/glr.cc, data/skeletons/yacc.c:
Stop using b4_function_declare.
2020-03-02 06:58:14 +01:00
Akim Demaille
4cca30d2e6 m4: decommission function generating macro
These macros have been extremely useful when we had to support K&R C,
which we dropped long ago.  Now, they merely make the code uselessly
hard to read.

* data/skeletons/c.m4, data/skeletons/glr.c, data/skeletons/glr.cc,
* data/skeletons/yacc.c:
Stop using b4_function_define.
2020-03-02 06:57:50 +01:00
Akim Demaille
5789f9d91e examples: bistromathic: demonstrate the use of yyexpected_tokens
Let's use GNU readline and its TAB autocompletion to demonstrate the
use of yyexpected_tokens.

This shows a number of weaknesses in our current approach:

- some macros (yyssp, etc.) from push parsers "leak" in user code, we
  need to undefine them

- the context needed by yyexpected_tokens does not need the token,
  yypstate actually suffices

- yypstate is not properly set up when first allocated, which results
  in a crash of yyexpected_tokens if fired before a first token was
  read.  We should move initialization from yypush_parse into
  yypstate_new.

* examples/c/bistromathic/parse.y (yylex): Take input as a string, not
a file.
(EXIT): New token.
(input): Adjust to work only on a line.
(line): Remove.
(symbol_count, process_line, expected_tokens, completion)
(init_readline): New.
* examples/c/bistromathic/bistromathic.test: Adjust expectations.
2020-03-01 12:31:39 +01:00
Akim Demaille
b269a45fa4 examples: use consistently the GFDL header for readmes
* examples/c++/README.md, examples/c++/calc++/README.md,
* examples/c/calc/README.md, examples/c/lexcalc/README.md,
* examples/c/pushcalc/README.md, examples/c/reccalc/README.md:
Prefer the GFDL banner to the GPL one.
2020-03-01 07:56:02 +01:00
Akim Demaille
1478fccd23 gnulib: use readline
2020-03-01 06:23:49 +01:00
Akim Demaille
535281f0ff examples: bistromathic: don't use Flex
This example will soon use GNU readline, so its scanner should be easy
to use (concurrently) on strings, not streams.  This is not a place
where Flex shines, and anyway, these are examples of Bison, not Flex.
There's already lexcalc and reccalc that demonstrate the use of Flex.

* examples/c/bistromathic/scan.l: Remove.
* examples/c/bistromathic/parse.y (yylex): New.
Adjust dependencies.
2020-02-29 17:52:08 +01:00