doc: overhaul of the readmes

* README-hacking.md (Working from the Repository): Make it first to
make it easier to find the instructions to build from the repo.
(Implementation Notes): New.
* README: Provide more links.
2026-04-23 18:19:38 +00:00 · 2020-06-28 14:35:55 +02:00
parent e0b0a67b86
commit 160df55c56
3 changed files with 205 additions and 190 deletions
@@ -5,185 +5,6 @@ Everything related to the development of Bison is on Savannah:
 http://savannah.gnu.org/projects/bison/.


-Administrivia
-=============
-
-## If you incorporate a change from somebody on the net:
-First, if it is a large change, you must make sure they have signed the
-appropriate paperwork.  Second, be sure to add their name and email address
-to THANKS.
-
-## If a change fixes a test, mention the test in the commit message.
-
-## Bug reports
-If somebody reports a new bug, mention his name in the commit message and in
-the test case you write.  Put him into THANKS.
-
-The correct response to most actual bugs is to write a new test case which
-demonstrates the bug.  Then fix the bug, re-run the test suite, and check
-everything in.
-
-
-
-Hacking
-=======
-
-## Visible Changes
-Which include serious bug fixes, must be mentioned in NEWS.
-
-## Translations
-Only user visible strings are to be translated: error messages, bits of the
-.output file etc.  This excludes impossible error messages (comparable to
-assert/abort), and all the --trace output which is meant for the maintainers
-only.
-
-## Vocabulary
- "nonterminal", not "variable" or "non-terminal" or "non terminal".
-  Abbreviated as "nterm".
- "shift/reduce" and "reduce/reduce", not "shift-reduce" or "shift reduce",
-  etc.
-
-## Syntax highlighting
-It's quite nice to be in C++ mode when editing lalr1.cc for instance.
-However tools such as Emacs will be fooled by the fact that braces and
-parens do not nest, as in `[[}]]`.  As a consequence you might be misguided
-by its visual pairing to parens.  The m4-mode is safer.  Unfortunately the
-m4-mode is also fooled by `#` which it sees as a comment, and stops pairing
-parens/brackets that are inside...
-
-## Coding Style
-Do not add horizontal tab characters to any file in Bison's repository
-except where required.  For example, do not use tabs to format C code.
-However, make files, ChangeLog, and some regular expressions require tabs.
-Also, test cases might need to contain tabs to check that Bison properly
-processes tabs in its input.
-
-Prefer "res" as the name of the local variable that will be "return"ed by
-the function.
-
-### Bison
-Follow the GNU Coding Standards.
-
-Don't reinvent the wheel: we use gnulib, which features many components.
-Actually, Bison has legacy code that we should replace with gnulib modules
-(e.g., many ad hoc implementations of lists).
-
-#### Includes
-The `#include` directives follow an order:
- first section for *.c files is `<config.h>`.  Don't include it in header
-  files
- then, for *.c files, the corresponding *.h file
- then possibly the `"system.h"` header
- then the system headers.
-  Consider headers from `lib/` like system headers (i.e., `#include
-  <verify.h>`, not `#include "verify.h"`).
- then headers from src/ with double quotes (`#include "getargs.h"`).
-
-Keep headers sorted alphabetically in each section.
-
-See also the [Header
-files](https://www.gnu.org/software/gnulib/manual/html_node/Header-files.html)
-and the [Implementation
-files](https://www.gnu.org/software/gnulib/manual/html_node/Implementation-files.html#Implementation-files)
-nodes of the gnulib documentation.
-
-Some source files are in the build tree (e.g., `src/scan-gram.c` made from
-`src/scan-gram.l`).  For them to find the headers from `src/`, we actually
-use `#include "src/getargs.h"` instead of `#include "getargs.h"`---that
-saves us from additional `-I` flags.
-
-### Skeletons
-We try to use the "typical" coding style for each language.
-
-#### CPP
-We indent the CPP directives this way:
-
-```
-#if FOO
-# if BAR
-#  define BAZ
-# endif
-#endif
-```
-
-Don't indent with leading spaces in the skeletons (it's OK in the grammar
-files though, e.g., in `%code {...}` blocks).
-
-On occasions, use `cppi -c` to see where we stand.  We don't aim at full
-correctness: depending on `-d`, some bits can be in the *.c file, or the *.h
-file within the double-inclusion cpp-guards.  In that case, favor the case
-of the *.h file, but don't waste time on this.
-
-Don't hesitate to leave a comment on the `#endif` (e.g., `#endif /* FOO
-*/`), especially for long blocks.
-
-There is no consistency on `! defined` vs. `!defined`.  The day gnulib
-decides, we'll follow them.
-
-#### C/C++
-Follow the GNU Coding Standards.
-
-The `glr.c` skeleton was implemented with `camelCase`.  We are migrating it
-to `snake_case`.  Because we are standardizing the code, it is currently
-inconsistent.
-
-Use `YYFOO` and `yyfoo` for entities that are exposed to the user.  They are
-part of our contract with the users wrt backward compatibility.
-
-Use `YY_FOO` and `yy_foo` for private matters.  Users should not use them,
-we are free to change them without fear of backward compatibility issues.
-
-Use `*_t` for types, especially for `yy*_t` in which case we shouldn't worry
-about the C standard introducing such a name.
-
-#### C++
-Follow the C++ Core Guidelines
-(http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).  The Google
-ones may be interesting too
-(https://google.github.io/styleguide/cppguide.html).
-
-Our enumerators, such as the kinds (symbol and token kinds), should be lower
-case, but it was too late to follow that track for token kinds, and symbol
-kind enumerators are made to be consistent with them.
-
-Use `*_type` for type aliases.  Use `foo_get()` and `foo_set(v)` for
-accessors, or simply `foo()` and `foo(v)`.
-
-Use the `yy` prefix for private stuff, but there's no need for it in the
-public API.  The `yy` prefix is already taken care of via the namespace.
-
-#### Java
-We follow https://www.oracle.com/technetwork/java/codeconventions-150003.pdf
-and https://google.github.io/styleguide/javaguide.html.  Unfortunately at
-some point some GNU Coding Style was installed in Java, but it's an error.
-So we should for instance stop putting spaces in function calls.  Because we
-are standardizing the code, it is currently inconsistent.
-
-Use a 2-space indentation (Google) rather than 4 (Oracle).
-
-Don't use the "yy" prefix for public members: "getExpectedTokens", not
-"yyexpectedTokens" or "yygetExpectedTokens".
-
-## Commit Messages
-Imitate the style we use.  Use `git log` to get sources of inspiration.
-
-If the changes have a small impact on Bison's generated parser, embed these
-changes in the commit itself.  If the impact is large, first push all the
-changes except those about src/parse-gram.[ch], and then another commit
-named "regen" which is only about them.
-
-## Debugging
-Bison supports tracing of its various steps, via the `--trace` option.
-Since it is not meant for the end user, it is not displayed by `bison
--help`, nor is it documented in the manual.  Instead, run `bison
--trace=help`.
-
-## Documentation
-Use `@option` for options and options with their argument if they have no
-space (e.g., `@option{-Dfoo=bar}`).  However, use `@samp` elsewhere (e.g.,
-`@samp{-I foo}`).
-
-
 Working from the Repository
 ===========================

@@ -357,6 +178,196 @@ version, compile bison, then force it to recreate the files:
    $ make -C _build


+Administrivia
+=============
+
+## If you incorporate a change from somebody on the net:
+First, if it is a large change, you must make sure they have signed the
+appropriate paperwork.  Second, be sure to add their name and email address
+to THANKS.
+
+## If a change fixes a test, mention the test in the commit message.
+
+## Bug reports
+If somebody reports a new bug, mention his name in the commit message and in
+the test case you write.  Put him into THANKS.
+
+The correct response to most actual bugs is to write a new test case which
+demonstrates the bug.  Then fix the bug, re-run the test suite, and check
+everything in.
+
+
+
+Hacking
+=======
+
+## Visible Changes
+Which include serious bug fixes, must be mentioned in NEWS.
+
+## Translations
+Only user visible strings are to be translated: error messages, bits of the
+.output file etc.  This excludes impossible error messages (comparable to
+assert/abort), and all the --trace output which is meant for the maintainers
+only.
+
+## Vocabulary
+- "nonterminal", not "variable" or "non-terminal" or "non terminal".
+  Abbreviated as "nterm".
+- "shift/reduce" and "reduce/reduce", not "shift-reduce" or "shift reduce",
+  etc.
+
+## Syntax Highlighting
+It's quite nice to be in C++ mode when editing lalr1.cc for instance.
+However tools such as Emacs will be fooled by the fact that braces and
+parens do not nest, as in `[[}]]`.  As a consequence you might be misguided
+by its visual pairing to parens.  The m4-mode is safer.  Unfortunately the
+m4-mode is also fooled by `#` which it sees as a comment, and stops pairing
+parens/brackets that are inside...
+
+## Implementation Notes
+There are several places with interesting details about the implementation:
+- [Understanding C parsers generated by GNU
+Bison](https://www.cs.uic.edu/~spopuri/cparser.html) by Satya Kiran Popuri,
+is a wonderful piece of work that explains the implementation of Bison,
+- [src/gram.h](src/gram.h) documents the way the grammar is represented
+- [src/tables.h](src/tables.h) documents the generated tables
+- [data/README.md](data/README.md) contains details about the m4 implementation
+
+## Coding Style
+Do not add horizontal tab characters to any file in Bison's repository
+except where required.  For example, do not use tabs to format C code.
+However, make files, ChangeLog, and some regular expressions require tabs.
+Also, test cases might need to contain tabs to check that Bison properly
+processes tabs in its input.
+
+Prefer `res` as the name of the local variable that will be "return"ed by
+the function.
+
+### Bison
+Follow the GNU Coding Standards.
+
+Don't reinvent the wheel: we use gnulib, which features many components.
+Actually, Bison has legacy code that we should replace with gnulib modules
+(e.g., many ad hoc implementations of lists).
+
+#### Includes
+The `#include` directives follow an order:
+- first section for *.c files is `<config.h>`.  Don't include it in header
+  files
+- then, for *.c files, the corresponding *.h file
+- then possibly the `"system.h"` header
+- then the system headers.
+  Consider headers from `lib/` like system headers (i.e., `#include
+  <verify.h>`, not `#include "verify.h"`).
+- then headers from src/ with double quotes (`#include "getargs.h"`).
+
+Keep headers sorted alphabetically in each section.
+
+See also the [Header
+files](https://www.gnu.org/software/gnulib/manual/html_node/Header-files.html)
+and the [Implementation
+files](https://www.gnu.org/software/gnulib/manual/html_node/Implementation-files.html#Implementation-files)
+nodes of the gnulib documentation.
+
+Some source files are in the build tree (e.g., `src/scan-gram.c` made from
+`src/scan-gram.l`).  For them to find the headers from `src/`, we actually
+use `#include "src/getargs.h"` instead of `#include "getargs.h"`---that
+saves us from additional `-I` flags.
+
+### Skeletons
+We try to use the "typical" coding style for each language.
+
+#### CPP
+We indent the CPP directives this way:
+
+```
+#if FOO
+# if BAR
+#  define BAZ
+# endif
+#endif
+```
+
+Don't indent with leading spaces in the skeletons (it's OK in the grammar
+files though, e.g., in `%code {...}` blocks).
+
+On occasions, use `cppi -c` to see where we stand.  We don't aim at full
+correctness: depending on `-d`, some bits can be in the *.c file, or the *.h
+file within the double-inclusion cpp-guards.  In that case, favor the case
+of the *.h file, but don't waste time on this.
+
+Don't hesitate to leave a comment on the `#endif` (e.g., `#endif /* FOO
+*/`), especially for long blocks.
+
+There is no consistency on `! defined` vs. `!defined`.  The day gnulib
+decides, we'll follow them.
+
+#### C/C++
+Follow the GNU Coding Standards.
+
+The `glr.c` skeleton was implemented with `camelCase`.  We are migrating it
+to `snake_case`.  Because we are gradually standardizing the code, it is
+currently inconsistent.
+
+Use `YYFOO` and `yyfoo` for entities that are exposed to the user.  They are
+part of our contract with the users wrt backward compatibility.
+
+Use `YY_FOO` and `yy_foo` for private matters.  Users should not use them,
+we are free to change them without fear of backward compatibility issues.
+
+Use `*_t` for types, especially for `yy*_t` in which case we shouldn't worry
+about the C standard introducing such a name.
+
+#### C++
+Follow the [C++ Core
+Guidelines](http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).
+The [Google ones](https://google.github.io/styleguide/cppguide.html) may be
+interesting too.
+
+Our enumerators, such as the kinds (symbol and token kinds), should be lower
+case, but it was too late to follow that track for token kinds, and symbol
+kind enumerators are made to be consistent with them.
+
+Use `*_type` for type aliases.  Use `foo_get()` and `foo_set(v)` for
+accessors, or simply `foo()` and `foo(v)`.
+
+Use the `yy` prefix for private stuff, but there's no need for it in the
+public API.  The `yy` prefix is already taken care of via the namespace.
+
+#### Java
+We follow the [Java Code
+Conventions](https://www.oracle.com/technetwork/java/codeconventions-150003.pdf)
+and [Google Java Style
+Guide](https://google.github.io/styleguide/javaguide.html).  Unfortunately
+at some point some GNU Coding Style was installed in Java, but it's an
+error.  So we should for instance stop putting spaces in function calls.
+Because we are standardizing the code, it is currently inconsistent.
+
+Use a 2-space indentation (Google) rather than 4 (Oracle).
+
+Don't use the "yy" prefix for public members: "getExpectedTokens", not
+"yyexpectedTokens" or "yygetExpectedTokens".
+
+## Commit Messages
+Imitate the style we use.  Use `git log` to get sources of inspiration.
+
+If the changes have a small impact on Bison's generated parser, embed these
+changes in the commit itself.  If the impact is large, first push all the
+changes except those about src/parse-gram.[ch], and then another commit
+named "regen" which is only about them.
+
+## Debugging
+Bison supports tracing of its various steps, via the `--trace` option.
+Since it is not meant for the end user, it is not displayed by `bison
+--help`, nor is it documented in the manual.  Instead, run `bison
+--trace=help`.
+
+## Documentation
+Use `@option` for options and options with their argument if they have no
+space (e.g., `@option{-Dfoo=bar}`).  However, use `@samp` elsewhere (e.g.,
+`@samp{-I foo}`).
+
+
 Test Suite
 ==========

@@ -366,9 +377,9 @@ examples, and the main test suite.

 ### The Examples
 In examples/, there is a number of ready-to-use examples (see
-examples/README.md).  These examples have small test suites run by `make
-check`.  The test results are in local `*.log` files (e.g.,
-`$build/examples/c/calc/calc.log`).
+[examples/README.md](examples/README.md)).  These examples have small test
+suites run by `make check`.  The test results are in local `*.log` files
+(e.g., `$build/examples/c/calc/calc.log`).

 ### The Main Test Suite
 The main test suite, in tests/, is written on top of GNU Autotest, which is
@@ -548,7 +559,8 @@ re-run the tests, run:
 Release Procedure
 =================

-See README-release.
+See the [README-release file](README-release), created when the package is
+bootstrapped.

 <!--