doc: overhaul of the readmes

* README-hacking.md (Working from the Repository): Make it first to
make it easier to find the instructions to build from the repo.
(Implementation Notes): New.
* README: Provide more links.
Akim Demaille
2020-06-28 14:35:55 +02:00
parent e0b0a67b86
commit 160df55c56
3 changed files with 205 additions and 190 deletions

README

@@ -17,17 +17,18 @@ GNU Bison's home page is https://gnu.org/software/bison/.
# Installation # Installation
## Build from git ## Build from git
Here are basic installation instructions for a repository checkout: The [README-hacking.md file](README-hacking.md) is about building, modifying
and checking Bison. See its "Working from the Repository" section to build
Bison from the git repo. Roughly, run:
$ git submodule update --init $ git submodule update --init
$ ./bootstrap $ ./bootstrap
then proceed with the usual `configure && make` steps. then proceed with the usual `configure && make` steps.
The file README-hacking.md is about building, modifying and checking Bison.
## Build from tarball ## Build from tarball
See the file INSTALL for generic compilation and installation instructions. See the [INSTALL file](INSTALL) for generic compilation and installation
instructions.
Bison requires GNU m4 1.4.6 or later. See Bison requires GNU m4 1.4.6 or later. See
https://ftp.gnu.org/gnu/m4/m4-1.4.6.tar.gz. https://ftp.gnu.org/gnu/m4/m4-1.4.6.tar.gz.

README-hacking.md

@@ -5,185 +5,6 @@ Everything related to the development of Bison is on Savannah:
http://savannah.gnu.org/projects/bison/. http://savannah.gnu.org/projects/bison/.
Administrivia
=============
## If you incorporate a change from somebody on the net:
First, if it is a large change, you must make sure they have signed the
appropriate paperwork. Second, be sure to add their name and email address
to THANKS.
## If a change fixes a test, mention the test in the commit message.
## Bug reports
If somebody reports a new bug, mention his name in the commit message and in
the test case you write. Put him into THANKS.
The correct response to most actual bugs is to write a new test case which
demonstrates the bug. Then fix the bug, re-run the test suite, and check
everything in.
Hacking
=======
## Visible Changes
Visible changes, which include serious bug fixes, must be mentioned in NEWS.
## Translations
Only user visible strings are to be translated: error messages, bits of the
.output file etc. This excludes impossible error messages (comparable to
assert/abort), and all the --trace output which is meant for the maintainers
only.
## Vocabulary
- "nonterminal", not "variable" or "non-terminal" or "non terminal".
Abbreviated as "nterm".
- "shift/reduce" and "reduce/reduce", not "shift-reduce" or "shift reduce",
etc.
## Syntax highlighting
It's quite nice to be in C++ mode when editing lalr1.cc for instance.
However tools such as Emacs will be fooled by the fact that braces and
parens do not nest, as in `[[}]]`. As a consequence you might be misguided
by its visual pairing to parens. The m4-mode is safer. Unfortunately the
m4-mode is also fooled by `#`, which it sees as a comment, and stops pairing with
parens/brackets that are inside...
## Coding Style
Do not add horizontal tab characters to any file in Bison's repository
except where required. For example, do not use tabs to format C code.
However, make files, ChangeLog, and some regular expressions require tabs.
Also, test cases might need to contain tabs to check that Bison properly
processes tabs in its input.
Prefer "res" as the name of the local variable that will be "return"ed by
the function.
### Bison
Follow the GNU Coding Standards.
Don't reinvent the wheel: we use gnulib, which features many components.
Actually, Bison has legacy code that we should replace with gnulib modules
(e.g., many ad hoc implementations of lists).
#### Includes
The `#include` directives follow an order:
- first section for *.c files is `<config.h>`. Don't include it in header
files
- then, for *.c files, the corresponding *.h file
- then possibly the `"system.h"` header
- then the system headers.
Consider headers from `lib/` like system headers (i.e., `#include
<verify.h>`, not `#include "verify.h"`).
- then headers from src/ with double quotes (`#include "getargs.h"`).
Keep headers sorted alphabetically in each section.
See also the [Header
files](https://www.gnu.org/software/gnulib/manual/html_node/Header-files.html)
and the [Implementation
files](https://www.gnu.org/software/gnulib/manual/html_node/Implementation-files.html#Implementation-files)
nodes of the gnulib documentation.
Some source files are in the build tree (e.g., `src/scan-gram.c` made from
`src/scan-gram.l`). For them to find the headers from `src/`, we actually
use `#include "src/getargs.h"` instead of `#include "getargs.h"`---that
saves us from additional `-I` flags.
### Skeletons
We try to use the "typical" coding style for each language.
#### CPP
We indent the CPP directives this way:
```
#if FOO
# if BAR
#  define BAZ
# endif
#endif
```
Don't indent with leading spaces in the skeletons (it's OK in the grammar
files though, e.g., in `%code {...}` blocks).
On occasions, use `cppi -c` to see where we stand. We don't aim at full
correctness: depending on `-d`, some bits can be in the *.c file, or the *.h
file within the double-inclusion cpp-guards. In that case, favor the case
of the *.h file, but don't waste time on this.
Don't hesitate to leave a comment on the `#endif` (e.g., `#endif /* FOO
*/`), especially for long blocks.
There is no consistency on `! defined` vs. `!defined`. The day gnulib
decides, we'll follow them.
#### C/C++
Follow the GNU Coding Standards.
The `glr.c` skeleton was implemented with `camelCase`. We are migrating it
to `snake_case`. Because we are standardizing the code, it is currently
inconsistent.
Use `YYFOO` and `yyfoo` for entities that are exposed to the user. They are
part of our contract with the users wrt backward compatibility.
Use `YY_FOO` and `yy_foo` for private matters. Users should not use them,
we are free to change them without fear of backward compatibility issues.
Use `*_t` for types, especially for `yy*_t` in which case we shouldn't worry
about the C standard introducing such a name.
#### C++
Follow the C++ Core Guidelines
(http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). The Google
ones may be interesting too
(https://google.github.io/styleguide/cppguide.html).
Our enumerators, such as the kinds (symbol and token kinds), should be lower
case, but it was too late to follow that track for token kinds, and symbol
kind enumerators are made to be consistent with them.
Use `*_type` for type aliases. Use `foo_get()` and `foo_set(v)` for
accessors, or simply `foo()` and `foo(v)`.
Use the `yy` prefix for private stuff, but there's no need for it in the
public API. The `yy` prefix is already taken care of via the namespace.
#### Java
We follow https://www.oracle.com/technetwork/java/codeconventions-150003.pdf
and https://google.github.io/styleguide/javaguide.html. Unfortunately at
some point some GNU Coding Style was installed in Java, but it's an error.
So we should for instance stop putting spaces in function calls. Because we
are standardizing the code, it is currently inconsistent.
Use a 2-space indentation (Google) rather than 4 (Oracle).
Don't use the "yy" prefix for public members: "getExpectedTokens", not
"yyexpectedTokens" or "yygetExpectedTokens".
## Commit Messages
Imitate the style we use. Use `git log` to get sources of inspiration.
If the changes have a small impact on Bison's generated parser, embed these
changes in the commit itself. If the impact is large, first push all the
changes except those about src/parse-gram.[ch], and then another commit
named "regen" which is only about them.
## Debugging
Bison supports tracing of its various steps, via the `--trace` option.
Since it is not meant for the end user, it is not displayed by `bison
--help`, nor is it documented in the manual. Instead, run `bison
--trace=help`.
## Documentation
Use `@option` for options and options with their argument if they have no
space (e.g., `@option{-Dfoo=bar}`). However, use `@samp` elsewhere (e.g.,
`@samp{-I foo}`).
Working from the Repository Working from the Repository
=========================== ===========================
@@ -357,6 +178,196 @@ version, compile bison, then force it to recreate the files:
$ make -C _build $ make -C _build
Administrivia
=============
## If you incorporate a change from somebody on the net:
First, if it is a large change, you must make sure they have signed the
appropriate paperwork. Second, be sure to add their name and email address
to THANKS.
## If a change fixes a test, mention the test in the commit message.
## Bug reports
If somebody reports a new bug, mention his name in the commit message and in
the test case you write. Put him into THANKS.
The correct response to most actual bugs is to write a new test case which
demonstrates the bug. Then fix the bug, re-run the test suite, and check
everything in.
Hacking
=======
## Visible Changes
Visible changes, which include serious bug fixes, must be mentioned in NEWS.
## Translations
Only user visible strings are to be translated: error messages, bits of the
.output file etc. This excludes impossible error messages (comparable to
assert/abort), and all the --trace output which is meant for the maintainers
only.
## Vocabulary
- "nonterminal", not "variable" or "non-terminal" or "non terminal".
Abbreviated as "nterm".
- "shift/reduce" and "reduce/reduce", not "shift-reduce" or "shift reduce",
etc.
## Syntax Highlighting
It's quite nice to be in C++ mode when editing lalr1.cc for instance.
However tools such as Emacs will be fooled by the fact that braces and
parens do not nest, as in `[[}]]`. As a consequence you might be misguided
by its visual pairing to parens. The m4-mode is safer. Unfortunately the
m4-mode is also fooled by `#`, which it sees as a comment, and stops pairing with
parens/brackets that are inside...
## Implementation Notes
There are several places with interesting details about the implementation:
- [Understanding C parsers generated by GNU
Bison](https://www.cs.uic.edu/~spopuri/cparser.html) by Satya Kiran Popuri,
is a wonderful piece of work that explains the implementation of Bison,
- [src/gram.h](src/gram.h) documents the way the grammar is represented
- [src/tables.h](src/tables.h) documents the generated tables
- [data/README.md](data/README.md) contains details about the m4 implementation
## Coding Style
Do not add horizontal tab characters to any file in Bison's repository
except where required. For example, do not use tabs to format C code.
However, make files, ChangeLog, and some regular expressions require tabs.
Also, test cases might need to contain tabs to check that Bison properly
processes tabs in its input.
Prefer `res` as the name of the local variable that will be "return"ed by
the function.
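As an illustration of the `res` convention, here is a minimal sketch. The function `sum_items` is hypothetical and is not part of Bison; it only shows the value to be returned being accumulated in a local named `res`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from Bison: sum the LEN first
   items of ITEMS.  */
static int
sum_items (const int *items, size_t len)
{
  assert (items || len == 0);
  /* The value that will be returned lives in a local named "res".  */
  int res = 0;
  for (size_t i = 0; i < len; ++i)
    res += items[i];
  return res;
}
```

Using one name for the returned value makes it easy to spot where a function's result is built.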
### Bison
Follow the GNU Coding Standards.
Don't reinvent the wheel: we use gnulib, which features many components.
Actually, Bison has legacy code that we should replace with gnulib modules
(e.g., many ad hoc implementations of lists).
#### Includes
The `#include` directives follow an order:
- first section for *.c files is `<config.h>`. Don't include it in header
files
- then, for *.c files, the corresponding *.h file
- then possibly the `"system.h"` header
- then the system headers.
Consider headers from `lib/` like system headers (i.e., `#include
<verify.h>`, not `#include "verify.h"`).
- then headers from src/ with double quotes (`#include "getargs.h"`).
Keep headers sorted alphabetically in each section.
See also the [Header
files](https://www.gnu.org/software/gnulib/manual/html_node/Header-files.html)
and the [Implementation
files](https://www.gnu.org/software/gnulib/manual/html_node/Implementation-files.html#Implementation-files)
nodes of the gnulib documentation.
Some source files are in the build tree (e.g., `src/scan-gram.c` made from
`src/scan-gram.l`). For them to find the headers from `src/`, we actually
use `#include "src/getargs.h"` instead of `#include "getargs.h"`---that
saves us from additional `-I` flags.
### Skeletons
We try to use the "typical" coding style for each language.
#### CPP
We indent the CPP directives this way:
```
#if FOO
# if BAR
#  define BAZ
# endif
#endif
```
Don't indent with leading spaces in the skeletons (it's OK in the grammar
files though, e.g., in `%code {...}` blocks).
On occasions, use `cppi -c` to see where we stand. We don't aim at full
correctness: depending on `-d`, some bits can be in the *.c file, or the *.h
file within the double-inclusion cpp-guards. In that case, favor the case
of the *.h file, but don't waste time on this.
Don't hesitate to leave a comment on the `#endif` (e.g., `#endif /* FOO
*/`), especially for long blocks.
There is no consistency on `! defined` vs. `!defined`. The day gnulib
decides, we'll follow them.
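The CPP rules above can be sketched in a small compilable file. `FOO`, `BAR` and `BAZ` are the placeholder names from the example above, and `baz_value` and `check_baz` are hypothetical helpers added only for this illustration. Each nesting level adds one space after the `#`, which itself stays in the first column, and each `#endif` carries a comment naming its condition.

```c
#include <assert.h>

/* FOO and BAR are placeholders for this sketch, not Bison macros.  */
#define FOO 1
#define BAR 1

#if FOO
# if BAR
#  define BAZ 42
# else /* !BAR */
#  define BAZ 0
# endif /* BAR */
#endif /* FOO */

/* Return BAZ if the conditionals above defined it, -1 otherwise.  */
static int
baz_value (void)
{
  int res = -1;
#ifdef BAZ
  res = BAZ;
#endif /* BAZ */
  return res;
}

/* Check the value selected by the preprocessor.  */
static void
check_baz (void)
{
  assert (baz_value () == 42);
}
```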
#### C/C++
Follow the GNU Coding Standards.
The `glr.c` skeleton was implemented with `camelCase`. We are migrating it
to `snake_case`. Because we are gradually standardizing the code, it is
currently inconsistent.
Use `YYFOO` and `yyfoo` for entities that are exposed to the user. They are
part of our contract with the users wrt backward compatibility.
Use `YY_FOO` and `yy_foo` for private matters. Users should not use them,
we are free to change them without fear of backward compatibility issues.
Use `*_t` for types, especially for `yy*_t` in which case we shouldn't worry
about the C standard introducing such a name.
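A minimal sketch of these naming rules follows. Every identifier in it (`YYCOUNTMAX`, `yycount_t`, `YY_COUNT_STEP`, `yy_count_grow`, `yycountbump`) is hypothetical and chosen only to show the prefixes; none of them is an actual Bison symbol.

```c
#include <assert.h>

/* All names below are hypothetical; they only illustrate the
   prefixes and are not part of any Bison skeleton.  */

/* Exposed to the user: no underscore after the prefix.  These
   names would be covered by backward compatibility.  */
#define YYCOUNTMAX 8
typedef struct { int count; } yycount_t;

/* Private: underscore after the prefix, free to change.  */
#define YY_COUNT_STEP 2
static int
yy_count_grow (int count)
{
  int res = count + YY_COUNT_STEP;
  return res < YYCOUNTMAX ? res : YYCOUNTMAX;
}

/* User-visible entry point built on the private helper.  */
static void
yycountbump (yycount_t *c)
{
  assert (c);
  c->count = yy_count_grow (c->count);
}
```

With this split, renaming `yy_count_grow` or `YY_COUNT_STEP` would not affect user code, whereas renaming `yycountbump`, `yycount_t` or `YYCOUNTMAX` would.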
#### C++
Follow the [C++ Core
Guidelines](http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).
The [Google ones](https://google.github.io/styleguide/cppguide.html) may be
interesting too.
Our enumerators, such as the kinds (symbol and token kinds), should be lower
case, but it was too late to follow that track for token kinds, and symbol
kind enumerators are made to be consistent with them.
Use `*_type` for type aliases. Use `foo_get()` and `foo_set(v)` for
accessors, or simply `foo()` and `foo(v)`.
Use the `yy` prefix for private stuff, but there's no need for it in the
public API. The `yy` prefix is already taken care of via the namespace.
#### Java
We follow the [Java Code
Conventions](https://www.oracle.com/technetwork/java/codeconventions-150003.pdf)
and [Google Java Style
Guide](https://google.github.io/styleguide/javaguide.html). Unfortunately
at some point some GNU Coding Style was installed in Java, but it's an
error. So we should for instance stop putting spaces in function calls.
Because we are standardizing the code, it is currently inconsistent.
Use a 2-space indentation (Google) rather than 4 (Oracle).
Don't use the "yy" prefix for public members: "getExpectedTokens", not
"yyexpectedTokens" or "yygetExpectedTokens".
## Commit Messages
Imitate the style we use. Use `git log` to get sources of inspiration.
If the changes have a small impact on Bison's generated parser, embed these
changes in the commit itself. If the impact is large, first push all the
changes except those about src/parse-gram.[ch], and then another commit
named "regen" which is only about them.
## Debugging
Bison supports tracing of its various steps, via the `--trace` option.
Since it is not meant for the end user, it is not displayed by `bison
--help`, nor is it documented in the manual. Instead, run `bison
--trace=help`.
## Documentation
Use `@option` for options and options with their argument if they have no
space (e.g., `@option{-Dfoo=bar}`). However, use `@samp` elsewhere (e.g.,
`@samp{-I foo}`).
Test Suite Test Suite
========== ==========
@@ -366,9 +377,9 @@ examples, and the main test suite.
### The Examples ### The Examples
In examples/, there is a number of ready-to-use examples (see In examples/, there is a number of ready-to-use examples (see
examples/README.md). These examples have small test suites run by `make [examples/README.md](examples/README.md)). These examples have small test
check`. The test results are in local `*.log` files (e.g., suites run by `make check`. The test results are in local `*.log` files
`$build/examples/c/calc/calc.log`). (e.g., `$build/examples/c/calc/calc.log`).
### The Main Test Suite ### The Main Test Suite
The main test suite, in tests/, is written on top of GNU Autotest, which is The main test suite, in tests/, is written on top of GNU Autotest, which is
@@ -548,7 +559,8 @@ re-run the tests, run:
Release Procedure Release Procedure
================= =================
See README-release. See the [README-release file](README-release), created when the package is
bootstrapped.
<!-- <!--

data/README.md

@@ -1,6 +1,6 @@
This directory contains data needed by Bison. This directory contains data needed by Bison.
# Directory content # Directory Content
## Skeletons ## Skeletons
Bison skeletons: the general shapes of the different parser kinds, that are Bison skeletons: the general shapes of the different parser kinds, that are
specialized for specific grammars by the bison program. specialized for specific grammars by the bison program.
@@ -48,7 +48,7 @@ various formats.
- xml2xhtml.xsl - xml2xhtml.xsl
Conversion into XHTML. Conversion into XHTML.
# Implementation note about the skeletons # Implementation Notes About the Skeletons
"Skeleton" in Bison parlance means "backend": a skeleton is fed by the bison "Skeleton" in Bison parlance means "backend": a skeleton is fed by the bison
executable with LR tables, facts about the symbols, etc. and they generate executable with LR tables, facts about the symbols, etc. and they generate
@@ -179,7 +179,7 @@ The data corresponding to the symbol `#POS`, where the current rule has
Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols
on RHS. on RHS.
----- <!--
Local Variables: Local Variables:
mode: markdown mode: markdown
@@ -203,3 +203,5 @@ GNU General Public License for more details.
You should have received a copy of the GNU General Public License You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. along with this program. If not, see <http://www.gnu.org/licenses/>.
-->