doc: updates for 3.6

* doc/bison.texi: More s/token type/token kind/.
* NEWS: Update.
2026-06-16 12:42:13 +00:00 · 2020-04-13 19:06:06 +02:00
parent caadfc552b
commit 5d983253f7
3 changed files with 70 additions and 55 deletions
@@ -19,7 +19,7 @@ GNU Bison NEWS
 *** Improved syntax error messages
  Two new values for the %define parse.error variable offer more control to
-  the user.
+  the user.  Available in all the skeletons (C, C++, Java).
 **** %define parse.error detailed
@@ -34,7 +34,12 @@ GNU Bison NEWS
 **** %define parse.error custom
  With this directive, the user forges and emits the syntax error message
-  herself by defining a function such as:
+  herself by defining the yyreport_syntax_error function.  A new type,
+  yypcontext_t, captures the circumstances of the error, and provides the
+  user with functions to get details, such as yypcontext_expected_tokens to
+  get the list of expected token kinds.
+  A possible implementation of yyreport_syntax_error is:
    int
    yyreport_syntax_error (const yypcontext_t *ctx)
@@ -86,35 +91,42 @@ GNU Bison NEWS
 *** List of expected tokens (yacc.c)
-  At any point during parsing (including even before submitting the first
+  Push parsers may invoke yypstate_expected_tokens at any point during
-  token), push parsers may now invoke yypstate_expected_tokens to get the
+  parsing (including even before submitting the first token) to get the list
-  list of possible tokens.  This feature can be used to propose
+  of possible tokens.  This feature can be used to propose autocompletion
-  autocompletion (see below the "bistromathic" example).
+  (see below the "bistromathic" example).
  It makes little sense to use this feature without enabling LAC (lookahead
  correction).
 *** Deep overhaul of the symbol and token kinds
-  To avoid the confusion with typing in programming languages, we now refer
+  To avoid the confusion with types in programming languages, we now refer
-  to token and symbol "kinds" instead of token and symbol "types".
+  to token and symbol "kinds" instead of token and symbol "types".  The
+  documentation and error messages have been revised.
+  All the skeletons have been updated to use dedicated enum types rather
+  than integral types.  Special symbols are now regular citizens, instead of
+  being declared in ad hoc ways.
 **** Token kinds
  The "token kind" is what is returned by the scanner, e.g., PLUS, NUMBER,
-  LPAREN, etc.  Users are invited to replace their uses of "enum
+  LPAREN, etc.  While backward compatibility is of course ensured, users are
-  yytokentype" by "yytoken_kind_t".
+  nonetheless invited to replace their uses of "enum yytokentype" by
+  "yytoken_kind_t".
  This type now also includes tokens that were previously hidden: YYEOF (end
  of input), YYUNDEF (undefined token), and YYERRCODE (error token).  They
-  now have string aliases, internationalized if internationalization is
+  now have string aliases, internationalized when internationalization is
  enabled.  Therefore, by default, error messages now refer to "end of file"
-  (internationalized) rather than the cryptic "$end".
+  (internationalized) rather than the cryptic "$end", or to "invalid token"
+  rather than "$undefined".
-  In most case, it is now useless to define the end-of-line token as
+  Therefore in most cases it is now useless to define the end-of-file token
-  follows:
+  as follows:
-    %token EOF 0  _("end of file")
+    %token T_EOF 0 "end of file"
  Rather simply use "YYEOF" in your scanner.
@@ -126,7 +138,9 @@ GNU Bison NEWS
   They are now exposed as an enum, "yysymbol_kind_t".
-  This allows users to tailor the error messages the way they want.
+  This allows users to tailor the error messages the way they want, or to
+  process some symbols in a specific way in autocompletion (see the
+  bistromathic example below).
 *** Modernize display of explanatory statements in diagnostics
@@ -166,12 +180,18 @@ GNU Bison NEWS
  The lexcalc example (a simple example in C based on Flex and Bison) now
  also demonstrates location tracking.
  A new C example, bistromathic, is a fully featured interactive calculator
  using many Bison features: pure interface, push parser, autocompletion
  based on the current parser state (using yypstate_expected_tokens),
  location tracking, internationalized custom error messages, lookahead
  correction, rich debug traces, etc.
  It shows how to depend on the symbol kinds to tailor autocompletion.  For
  instance it recognizes the symbol kind "VARIABLE" to propose
  autocompletion on the existing variables, rather than of the word
  "variable".
 * Noteworthy changes in release 3.5.4 (2020-04-05) [stable]
 ** WARNING: Future backward-incompatibilities!
@@ -19,12 +19,11 @@
 - symbol.type_get should be kind_get, and it's not documented.
 - YYERRCODE and "end of file" and translation
-*** The documentation
+** Java
-You can explicitly specify the numeric code for a token type...
+*** Examples
 Have an example with a push parser.  Use autocompletion in that case.
-The token numbered as 0.
+*** calc.at
 ** Java: calc.at
 Stop hard-coding "Calc".  Adjust local.at (look for FIXME).
 ** doc
@@ -1232,7 +1232,7 @@ action in a GLR parser.
@cindex GLR parsers and @code{yylval}
@vindex yylloc
@cindex GLR parsers and @code{yylloc}
-In any semantic action, you can examine @code{yychar} to determine the type
+In any semantic action, you can examine @code{yychar} to determine the kind
 of the lookahead token present at the time of the associated reduction.
 After checking that @code{yychar} is not set to @code{YYEMPTY} or
@code{YYEOF}, you can then examine @code{yylval} and @code{yylloc} to
@@ -1853,7 +1853,7 @@ for such a single-character token is the character itself.
 The return value of the lexical analyzer function is a numeric code which
 represents a token kind.  The same text used in Bison rules to stand for
-this token kind is also a C expression for the numeric code for the type.
+this token kind is also a C expression for the numeric code of the kind.
 This works in two ways.  If the token kind is a character literal, then its
 numeric code is that of the character; you can use the same character
 literal in the lexical analyzer to express the number.  If the token kind is
@@ -2230,14 +2230,13 @@ the same as the declarations for the infix notation calculator.
@end example
@noindent
-Note there are no declarations specific to locations.  Defining a data
+Note there are no declarations specific to locations.  Defining a data type
-type for storing locations is not needed: we will use the type provided
+for storing locations is not needed: we will use the type provided by
-by default (@pxref{Location Type}), which is a
+default (@pxref{Location Type}), which is a four member structure with the
-four member structure with the following integer fields:
+following integer fields: @code{first_line}, @code{first_column},
-@code{first_line}, @code{first_column}, @code{last_line} and
+@code{last_line} and @code{last_column}.  By convention, and in accordance
-@code{last_column}.  By conventions, and in accordance with the GNU
+with the GNU Coding Standards and common practice, the line and column count
-Coding Standards and common practice, the line and column count both
+both start at 1.
-start at 1.
@node Ltcalc Rules
@subsection Grammar Rules for @code{ltcalc}
@@ -2646,7 +2645,7 @@ By simply editing the initialization list and adding the necessary include
 files, you can add additional functions to the calculator.
 Two important functions allow look-up and installation of symbols in the
-symbol table.  The function @code{putsym} is passed a name and the type
+symbol table.  The function @code{putsym} is passed a name and the kind
 (@code{VAR} or @code{FUN}) of the object to be installed.  The object is
 linked to the front of the list, and a pointer to the object is returned.
 The function @code{getsym} is passed the name of the symbol to look up.  If
@@ -3698,10 +3697,9 @@ In a simple program it may be sufficient to use the same data type for
 the semantic values of all language constructs.  This was true in the
 RPN and infix calculator examples (@pxref{RPN Calc}).
-Bison normally uses the type @code{int} for semantic values if your
+Bison normally uses the type @code{int} for semantic values if your program
-program uses the same data type for all language constructs.  To
+uses the same data type for all language constructs.  To specify some other
-specify some other type, define the @code{%define} variable
+type, define the @code{%define} variable @code{api.value.type} like this:
-@code{api.value.type} like this:
@example
 %define api.value.type @{double@}
@@ -4492,10 +4490,9 @@ Defining a data type for locations is much simpler than for semantic values,
 since all tokens and groupings always use the same type.
 You can specify the type of locations by defining a macro called
-@code{YYLTYPE}, just as you can specify the semantic value type by
+@code{YYLTYPE}, just as you can specify the semantic value type by defining
-defining a @code{YYSTYPE} macro (@pxref{Value Type}).
+a @code{YYSTYPE} macro (@pxref{Value Type}).  When @code{YYLTYPE} is not
-When @code{YYLTYPE} is not defined, Bison uses a default structure type with
+defined, Bison uses a default structure type with four members:
-four members:
@example
 typedef struct YYLTYPE
@@ -7161,7 +7158,7 @@ yylex (void)
    return c;      /* Assume token kind for '+' is '+'. */
  @dots{}
  else
-    return INT;    /* Return the type of the token. */
+    return INT;    /* Return the kind of the token. */
  @dots{}
@}
@end example
@@ -7211,7 +7208,7 @@ the type is @code{int} (the default), you might write this in @code{yylex}:
@group
  @dots{}
  yylval = value;  /* Put value onto Bison stack. */
-  return INT;      /* Return the type of the token. */
+  return INT;      /* Return the kind of the token. */
  @dots{}
@end group
@end example
@@ -7238,7 +7235,7 @@ then the code in @code{yylex} might look like this:
@group
  @dots{}
  yylval.intval = value; /* Put value onto Bison stack. */
-  return INT;            /* Return the type of the token. */
+  return INT;            /* Return the kind of the token. */
  @dots{}
@end group
@end example
@@ -7279,7 +7276,7 @@ yylex (YYSTYPE *lvalp, YYLTYPE *llocp)
@{
  @dots{}
  *lvalp = value;  /* Put value onto Bison stack. */
-  return INT;      /* Return the type of the token. */
+  return INT;      /* Return the kind of the token. */
  @dots{}
@}
@end example
@@ -8383,15 +8380,14 @@ represent the entire sequence of terminal and nonterminal symbols at or
 near the top of the stack.  The current state collects all the information
 about previous input which is relevant to deciding what to do next.
-Each time a lookahead token is read, the current parser state together
+Each time a lookahead token is read, the current parser state together with
-with the type of lookahead token are looked up in a table.  This table
+the kind of lookahead token are looked up in a table.  This table entry can
-entry can say, ``Shift the lookahead token.''  In this case, it also
+say, ``Shift the lookahead token.''  In this case, it also specifies the new
-specifies the new parser state, which is pushed onto the top of the
+parser state, which is pushed onto the top of the parser stack.  Or it can
-parser stack.  Or it can say, ``Reduce using rule number @var{n}.''
+say, ``Reduce using rule number @var{n}.''  This means that a certain number
-This means that a certain number of tokens or groupings are taken off
+of tokens or groupings are taken off the top of the stack, and replaced by
-the top of the stack, and replaced by one grouping.  In other words,
+one grouping.  In other words, that number of states are popped from the
-that number of states are popped from the stack, and one new state is
+stack, and one new state is pushed.
-pushed.
 There is one other alternative: the table can say that the lookahead token
 is erroneous in the current state.  This causes error processing to begin
@@ -11624,8 +11620,8 @@ particular it produces a genuine @code{union}, which have a few specific
 features in C++.
@itemize @minus
@item
-The type @code{YYSTYPE} is defined but its use is discouraged: rather
+The type @code{YYSTYPE} is defined but its use is discouraged: rather you
-you should refer to the parser's encapsulated type
+should refer to the parser's encapsulated type
@code{yy::parser::semantic_type}.
@item
 Non POD (Plain Old Data) types cannot be used.  C++98 forbids any instance