Commit Graph

42 Commits

Author SHA1 Message Date
Akim Demaille
2a7a2c1d3a tests: beware of mbswidth portability issues
Shy away from these issues on Cygwin.
Reported Denis Excoffier.
https://lists.gnu.org/r/bug-bison/2020-05/msg00003.html

* tests/diagnostics.at (Tabulations and multibyte characters): Split
in two.
2020-05-03 11:28:36 +02:00
Akim Demaille
758172a8b9 doc: spell check
* doc/bison.texi, NEWS, README-hacking.md: here.
And elsewhere.
2020-04-13 18:50:05 +02:00
Akim Demaille
a555b41990 diagnostics: replace "user token number" by "token code"
Yet, don't change the structure identifier to avoid introducing
conflicts in Vincent Imbimbo's PR (which, amusingly enough, is about
conflicts).

* src/symtab.c: here.
* tests/diagnostics.at, tests/input.at: Adjust.
2020-04-12 13:56:44 +02:00
Akim Demaille
951da960e6 merge branch 'maint'
* upstream/maint:
  maint: post-release administrivia
  version 3.5.3
  news: update for 3.5.3
  yacc.c: make sure we properly propagated the user's number for error
  diagnostics: don't crash because of repeated definitions of error
  style: initialize some struct members
  diagnostics: beware of zero-width characters
  diagnostics: be sure to close the styling when lines are too short
  muscles: fix incorrect decoding of $
  code: be robust to reference with invalid tags
  build: fix typo
  doc: update recommandation for libtextstyle
  style: comment changes
  examples: use consistently the GFDL header for readmes
  style: remove useless declarations
  typo: succesful -> successful
  README: point to tests/bison, and document --trace
  gnulib: update
  maint: post-release administrivia
2020-03-08 10:13:16 +01:00
Akim Demaille
b638603477 diagnostics: beware of zero-width characters
Currenly we rely on (visual) width of the characters to decide where
to open and close the styling of the quoted lines.  This breaks when
we deal with zero-width characters: we cannot just rely on (visual)
columns, we need to know whether we are before, inside, or after the
highlighted portion.

* src/location.c (location_caret): col_end: no longer add 1, "regular"
characters have a width of 1, only 0-width characters have 0-width.
opened: replace with 'state', a three-valued enum.
Don't reopen the style if we already did.
* tests/diagnostics.at (Zero-width characters): New.
2020-03-08 08:10:11 +01:00
Akim Demaille
e21ff47f5d diagnostics: be sure to close the styling when lines are too short
bar.y:4.12-17: <error>error:</error> redefining user token number of foo
    -    4 | %token foo <error>123
    +    4 | %token foo <error>123</error>
           |            <error>^~~~~~</error>

* src/location.c (location_caret): Be sure to close.
* tests/diagnostics.at (Line is too short, and then you die): New.
2020-03-07 10:01:52 +01:00
Victor Morales Cayuela
e09a72eeb0 diagnostics: modernize the display of submessages
Since Bison 2.7, output was indented four spaces for explanatory
statements.  For example:

    input.y:2.7-13: error: %type redeclaration for exp
    input.y:1.7-11:     previous declaration

Since the introduction of caret-diagnostics, it became less clear.
Remove the indentation and display submessages as in GCC:

    input.y:2.7-13: error: %type redeclaration for exp
        2 | %type <float> exp
          |       ^~~~~~~
    input.y:1.7-11: note: previous declaration
        1 | %type <int> exp
          |       ^~~~~

* src/complain.h (SUB_INDENT): Remove.
(warnings): Add "note" to the enum.
* src/complain.h, src/complain.c (complain_indent): Replace by...
(subcomplain): this.
Adjust all dependencies.
* tests/actions.at, tests/diagnostics.at, tests/glr-regression.at,
* tests/input.at, tests/named-refs.at, tests/regression.at:
Adjust expectations.
2020-02-15 08:28:40 +01:00
Akim Demaille
fc2191f137 diagnostics: modernize bison's syntax errors
We used to display the unexpected token first:

    $ bison foo.y
    foo.y:1.8-13: error: syntax error, unexpected %token, expecting character literal or identifier or <tag>
        1 | %token %token
          |        ^~~~~~

GCC uses a different format:

    $ gcc-mp-9 foo.c
    foo.c:1:5: error: expected identifier or '(' before ')' token
        1 | int()()()
          |     ^

and so does Clang:

    $ clang-mp-9.0 foo.c
    foo.c:1:5: error: expected identifier or '('
    int()()()
        ^
    1 error generated.

They display the unexpected token last (or not at all).  Also, they
don't waste width with "syntax error".  Let's try that.  It gives, for
the same example as above:

    $ bison foo.y
    foo.y:1.8-13: error: expected character literal or identifier or <tag> before %token
        1 | %token %token
          |        ^~~~~~

* src/complain.h, src/complain.c (syntax_error): New.
* src/parse-gram.y (yyreport_syntax_error): Use it.
2020-01-23 08:30:28 +01:00
Akim Demaille
46ab1d0cbe diagnostics: report syntax errors in color
* src/parse-gram.y (parse.error): Set to 'custom'.
(yyreport_syntax_error): New.
* data/bison-default.css (.expected, .unexpected): New.
* tests/diagnostics.at: Adjust.
2020-01-23 08:26:33 +01:00
Akim Demaille
c67daa9a97 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-10 19:16:23 +01:00
Akim Demaille
8036635251 package: bump copyrights to 2020
Run 'make update-copyright'.
2020-01-05 10:26:35 +01:00
Akim Demaille
583c193ffa tests: fix comment and adjust to locale names on GNU/Linux
Reported by Denis Excoffier.

* tests/diagnostics.at: here.
2019-11-03 10:32:22 +01:00
Akim Demaille
47b9ada6fa tests: really check complaints from m4
* tests/diagnostics.at (Locations from M4, Tabulations and multibyte
characters from M4): These tests are actually checking a message
coming from C, not from M4.  Replace with...
(Complaints from M4): This.
2019-11-03 10:32:22 +01:00
Akim Demaille
4b4e532748 diagnostics: use grammar_file instead of current_file
Currently there are two globals denoting the input file: grammar_file
is the one from the command line, and current_file which might change
because of #line.  Use only the former.

* src/complain.c (error_message): here.
* tests/diagnostics.at: Adjust.
2019-10-26 09:11:40 +02:00
Akim Demaille
c483b6593f tests: refactor the handling of Perl
Let's make a difference between places where Perl is required for the
test (AT_PERL_REQUIRE), and the places where it's used to run the
test, but it's not not to run the test (AT_PERL_CHECK).

* tests/local.at (AT_REQUIRE): New.
(AT_PERL_CHECK, AT_PERL_REQUIRE): New.
Use them where appropriate.

* tests/local.mk ($(TESTSUITE)): Beware not to start the line with
'-pi' if Perl is empty, as Make understands this as "it's ok to fail".
Which it is not.
2019-10-13 09:22:05 +02:00
Akim Demaille
0c56c195e0 tests: be really robust to Perl missing
My previous tests (with ./configure PERL=false) have been fooled by
configure, that managed to find perl anyway.  This time, I ran this on
a Fedora in Docker, without Perl.

* tests/calc.at, tests/diagnostics.at, tests/headers.at,
* tests/input.at, tests/local.at, tests/named-refs.at,
* tests/output.at, tests/regression.at, tests/skeletons.at,
* tests/synclines.at, tests/torture.at: Don't require Perl.
2019-10-11 06:53:45 +02:00
Akim Demaille
9e6c5328d3 diagnostics: also show suggested %empty
* src/reader.c (grammar_rule_check_and_complete): Suggest to add %empty.
* tests/actions.at, tests/diagnostics.at: Adjust expectations.
2019-10-06 12:15:12 +02:00
Akim Demaille
fec13ce2db diagnostics: sort symbols per location
Because the checking of the grammar is made by phases after the whole
grammar was read, we sometimes have diagnostics that look weird.  In
some case, within one type of checking, the entities are not checked
in the order in which they appear in the file.  For instance, checking
symbols is done on the list of symbols sorted by tag:

    foo.y:1.20-22: warning: symbol BAR is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                    ^~~
    foo.y:1.16-18: warning: symbol QUX is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                ^~~

Let's sort them by location instead:

    foo.y:1.16-18: warning: symbol 'QUX' is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                ^~~
    foo.y:1.20-22: warning: symbol 'BAR' is used, but is not defined as a token and has no rules [-Wother]
        1 | %destructor {} QUX BAR
          |                    ^~~

* src/location.h (location_cmp): Be robust to empty file names.
* src/symtab.c (symbol_cmp): Sort by location.
* tests/input.at: Adjust expectations.
2019-10-06 09:54:25 +02:00
Akim Demaille
be3cf406af diagnostics: suggest fixes for undeclared symbols
From

    input.y:1.17-19: warning: symbol baz is used, but is not defined as a token and has no rules [-Wother]
         1 | %printer {} foo baz
           |                 ^~~

to

    input.y:1.17-19: warning: symbol 'baz' is used, but is not defined as a token and has no rules; did you mean 'bar'? [-Wother]
        1 | %printer {} foo baz
          |                 ^~~
          |                 bar

* bootstrap.conf: We need fstrcmp.
* src/symtab.c (symbol_from_uniqstr_fuzzy): New.
(complain_symbol_undeclared): Use it.
* tests/diagnostics.at (Suggestions): New.
* data/bison-default.css (insertion): Rename as...
(fixit-insert): this, as this is what GCC uses.
2019-10-06 09:54:25 +02:00
Akim Demaille
0b585c49ae diagnostics: display suggested update after the caret-info
This commit adds the suggestion in green, on the line below the
caret-and-tildes.

    foo.y:1.1-14: warning: deprecated directive: '%error-verbose', use '%define parse.error verbose' [-Wdeprecated]
        1 | %error-verbose
          | ^~~~~~~~~~~~~~
          | %define parse.error verbose

The current approach, with location_caret_suggestion, is fragile:
there's a protocol of calls to the complain functions which is strict.
We should rather have a richer structure describing the diagnostics,
including with submessages such as the suggestions, passed in the end
to the routines in charge of formatting and printing them.

* src/location.h, src/location.c (location_caret_suggestion): New.
* src/complain.c (deprecated_directive): Use it.
* tests/diagnostics.at, tests/input.at: Adjust expectations.
2019-10-06 08:07:57 +02:00
Akim Demaille
5f45cb05f1 diagnostics: don't print ellipsis on the caret line
From

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHIJKL
      | ...         ^~~~~~~~~~~~~~~~~~~~~~~~~~

to

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHI...
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~

* src/location.c (location_caret): here.
* tests/diagnostics.at: Adjust expectations.
2019-09-22 09:12:08 +02:00
Akim Demaille
b61b0eb9ac diagnostics: also show truncation at the end of line with "..."
From

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHIJKL
      | ...         ^~~~~~~~~~~~~~~~~~~~~~~~~~

to

    9 | ...TUVWXYZ  ABCDEFGHIJKLMNOPQRSTUVWXYZ  ABCDEFGHI...
      | ...         ^~~~~~~~~~~~~~~~~~~~~~~~~~

* src/location.c (location_caret): here.
* tests/diagnostics.at: Adjust expectations.
2019-09-22 09:12:08 +02:00
Akim Demaille
69277e109a diagnostics: check that quoted lines are truncated
* tests/diagnostics.at (Screen width: 60 columns, Screen width: 80
columns, Screen width: 200 columns): New tests.
2019-09-22 09:12:08 +02:00
Akim Demaille
8c18e3f18c api.token.raw: cannot be used with character literals
* src/parse-gram.y (CHAR): api.token.raw and character literals are
mutually exclusive.
* tests/input.at (Character literals and api.token.raw): New.
2019-09-14 10:09:08 +02:00
Akim Demaille
32dff87c1d diagnostics: fix use of complain_indent
* src/symtab.c (symbol_class_set): Here.
* tests/diagnostics.at, tests/input.at, tests/regression.at: Adjust
expectations.
2019-09-14 09:47:49 +02:00
Akim Demaille
19da501e06 input: stop treating lone CRs as end-of-lines
We used to treat lone CRs (\r, aka ^M) as regular NLs (\n), probably
to please Classic MacOS.  As of today, it makes more sense to treat \r
like a plain white space character.

https://lists.gnu.org/archive/html/bison-patches/2019-09/msg00027.html

* src/scan-gram.l (no_cr_read): Remove.  Instead, use...
(eol): this new abbreviation denoting end-of-line.
* src/location.c (caret_getc): New.
(location_caret): Use it.
* tests/diagnostics.at (Carriage return): Adjust expectations.
(CR NL): New.
2019-09-14 09:23:47 +02:00
Akim Demaille
4eed3a0f0c diagnostics: beware of unexpected EOF when quoting the source file
When the input file contains lone CRs (aka, ^M, \r), the locations see
a new line.  Diagnostics look only at \n as end-of-line, so sometimes
there is an offset in diagnostics.  Worse yet: sometimes we loop
endlessly waiting for \n to come from a continuous stream of EOF.

Fix that:
- check for EOF
- beware not to call end_use_class if begin_use_class was not
  called (which would abort).  This could happen if the actual
  line is shorter that the expected one.

Prompted by a (private) report from Marc Schönefeld.

* src/location.c (location_caret): here.
* tests/diagnostics.at (Carriage return): New.
2019-09-12 07:02:46 +02:00
Akim Demaille
378963b139 tests: check token redeclaration
* src/symtab.c (symbol_class_set): Report previous definitions when
redeclared.
* tests/input.at (Symbol redeclared): New.
2019-09-07 17:09:43 +02:00
László Várady
9145bd0b61 diagnostics: fix invalid error message indentation
https://lists.gnu.org/archive/html/bison-patches/2019-08/msg00007.html

When Bison is started with a flag that suppresses warning messages, the
error_message() function can produce a few gigabytes of indentation
because of a dangling pointer.

* src/complain.c (error_message): Don't reset indent_ptr here, but...
(complain_indent): here.
* tests/diagnostics.at (Indentation with message suppression): Check
this case.
2019-08-18 09:40:44 -05:00
Akim Demaille
0269c6fb03 diagnostics: rename --style=debug as --color=debug
It is more consistent with --color=html, --color=test, etc.

* src/getargs.h, src/getargs.c (style_debug): Rename as...
(color_debug): this.
(getargs_colors): Rename --style=debug as --color=debug.
Adjust dependencies.
2019-05-08 13:36:47 +02:00
Akim Demaille
b5233ba323 tests: don't duplicate the portability prologue
* tests/actions.at, tests/input.at: Don't repeat the prologue, skip it.
* tests/diagnostics.at, tests/local.at: Comment changes.
2019-05-03 16:28:28 +02:00
Akim Demaille
57290d63fd package: various fixes for syntax-check
* cfg.mk: Disable checks where needed (e.g., we do want to check the
behavior with tabs).
(sc_at_parser_check): Remove.  Unfortunately since
a11c144609 we no longer use the './'
prefix to run programs in the current directory.  That was so that we
could run Java programs like the other, although they are no run with
the `./` prefix (see 967a59d2c0).
As a consequence this sc check no longer makes sense.
However, since now AT_PARSER_CHECK passes the `./` prefix itself, this
sc-check was superfluous.
* examples/c/reccalc/scan.l: Use memcpy, not strncpy.
* src/ielr.c, src/reader.c: Obfuscate "lr(0)" so that the sc-check for
"space before paren" does not fire.
* tests/diagnostics.at: Avoid space-tab, use tab-tab.
2019-04-28 08:24:31 +02:00
Akim Demaille
386cf25088 diagnostics: give m4 precise locations
Currently we pass only the columns based on the screen-width, which is
important for the carets.  But we don't pass the bytes-based columns,
which is important for the colors.  Pass both.

* src/muscle-tab.c (muscle_boundary_grow): Also pass the byte-based column.
* src/location.c (location_caret): Clarify.
(boundary_set_from_string): Adjust to the new format.
* tests/diagnostics.at (Tabulations and multibyte characters from M4): New.
2019-04-27 18:27:04 +02:00
Akim Demaille
a514c51e55 diagnostics: fix locations coming from M4
Locations issued from M4 need the byte-based column for the
diagnostics to work properly.  Currently they were unassigned, which
typically resulted in partially non-colored diagnostics.

* src/location.c (boundary_set_from_string): Fix the parsed location.
* src/muscle-tab.c (muscle_percent_define_default): Set the byte values.
* tests/diagnostics.at (Locations from M4): New.
2019-04-27 18:12:23 +02:00
Akim Demaille
971e72514f updates: insert/remove %empty
* src/reader.c (grammar_rule_check_and_complete): Generate fixits for
adding/removing %empty.
* tests/actions.at, tests/diagnostics.at, tests/existing.at: Adjust.
2019-04-24 13:21:24 +02:00
Akim Demaille
935d119c82 diagnostics: better rule locations
The "identifier and colon" of a rule is implemented as a single token,
but whose location is only that of the identifier (so that messages
about the lhs of a rule are accurate).  When reducing empty rules, the
default location is the single point location on the end of the
previous symbol.  As a consequence, when Bison parses a grammar, the
location of the right-hand side of an empty rule is based on the
lhs, *independently of the position of the colon*.  And the colon can
be way farther, separated by comments, white spaces, including empty
lines.

As a result, some messages look really bad.  For instance:

    $ cat foo.y
    %%
    foo     : /* empty */
    bar
    : /* empty */

gives

    $ bison -Wall foo.y
    foo.y:2.4: warning: empty rule without %empty [-Wempty-rule]
        2 | foo     : /* empty */
          |    ^
    foo.y:3.4: warning: empty rule without %empty [-Wempty-rule]
        3 | bar
          |    ^

The carets are not at the right column, not even the right line.

This commit passes the colon "again" after the "id colon" token, which
gives more accurate locations for these messages:

    $ bison -Wall foo.y
    foo.y:2.10: warning: empty rule without %empty [-Wempty-rule]
        2 | foo     : /* empty */
          |          ^
    foo.y:4.2: warning: empty rule without %empty [-Wempty-rule]
        4 | : /* empty */
          |  ^

* src/scan-gram.l (SC_AFTER_IDENTIFIER): Rollback the colon, so that
we scan it again afterwards.
(INITIAL): Scan colons.
* src/parse-gram.y (COLON): New.
(rules): Parse the colon after the rule's id_colon (and possible
named reference).
* tests/actions.at, tests/conflicts.at, tests/diagnostics.at,
* tests/existing.at: Adjust.
2019-04-24 13:08:51 +02:00
Akim Demaille
a992a3cb9e diagnostics: don't try to quote special files
Based on a report by Todd Freed.
http://lists.gnu.org/archive/html/bug-bison/2019-04/msg00000.html
See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90034

* src/location.c (caret_info): Also track the file name.
(location_caret): Don't quote special files.
2019-04-23 18:29:10 +02:00
Akim Demaille
a9b350fb3a diagnostics: copy GCC9's format
Currently, when we quote the source file, we indent it with one space,
and preserve tabulations, so there is a discrepancy and the visual
rendering is bad.  One way out is to indent with a tab instead of a
space, but then this space can be used for more information.  This is
what GCC9 does.  Let's play copy cats.

See
https://lists.gnu.org/archive/html/bison-patches/2019-04/msg00025.html
https://developers.redhat.com/blog/2019/03/08/usability-improvements-in-gcc-9/
https://gcc.gnu.org/onlinedocs/gccint/Guidelines-for-Diagnostics.html#Guidelines-for-Diagnostics

* src/location.c (location_caret): Prefix quoted lines with the line
number and a pipe, fitting 8 columns.

* tests/actions.at, tests/c++.at, tests/conflicts.at,
* tests/diagnostics.at, tests/input.at, tests/java.at,
* tests/named-refs.at, tests/reduce.at, tests/regression.at,
* tests/sets.at: Adjust expectations.
Partly by "./build-aux/update-test tests/testsuite.dir/*/testsuite.log"
repeatedly, and partly by hand.
2019-04-23 18:29:10 +02:00
Akim Demaille
afe7dfd3b9 diagnostics: fix the handling of multibyte characters
This is a pity: efforts were invested in computing correctly the
number of screen columns consumed by multibyte characters, but the
routines that do that were fed by single-byte inputs...

As a consequence Bison never displayed correctly locations when there
are multibyte characters.

* src/scan-gram.l (mbchar): New.
Use it instead of . in the catch-all clause.
* tests/diagnostics.at (Tabulations): Enhance into...
(Tabulations and multibyte characters): this.
2019-04-23 18:29:10 +02:00
Akim Demaille
6b6c3de2ae diagnostics: check the handling of tabulations
* tests/diagnostics.at (Tabulations): here.
2019-04-23 18:29:10 +02:00
Akim Demaille
1b70f687fa diagnostics: fix styling issues
Single point locations (equal boundaries) are troublesome, and we were
incorrectly ending the style in their case.  Which results in an abort
in libtextstyle.

There is also a confusion between columns as displayed on the
screen (which take into account multibyte characters and tabulations),
and the number of bytes.  Counting the screen-column
incrementally (character by character) is uneasy (because of multibyte
characters), and I don't want to maintain a buffer of the current line
when displaying the diagnostic.  So I believe the simplest solution is
to track the byte number in addition to the screen column.

* src/location.h, src/location.c (boundary): Add the byte-column.
Adjust dependencies.
* src/getargs.c, src/scan-gram.l: Adjust.
* tests/diagnostics.at: Check zero-width locations.
2019-04-23 18:29:10 +02:00
Akim Demaille
520d474ec6 diagnostics: check the styling
Enable checking of styles even when libtextstyle is not installed.

* src/getargs.h, src/getargs.c (style_debug): New.
(getargs_colors): Set it when --style=debug.
* src/complain.c (begin_use_class, end_use_class): Use it.
* tests/diagnostics.at: New.
2019-04-23 18:29:10 +02:00