mirror of
https://git.savannah.gnu.org/git/bison.git
synced 2026-03-09 04:13:03 +00:00
doc: overhaul of the readmes
* README-hacking.md (Working from the Repository): Make it first to make it easier to find the instructions to build from the repo. (Implementation Notes): New. * README: Provide more links.
@@ -17,17 +17,18 @@ GNU Bison's home page is https://gnu.org/software/bison/.
|
||||
|
||||
# Installation
|
||||
## Build from git
|
||||
Here are basic installation instructions for a repository checkout:
|
||||
The [README-hacking.md file](README-hacking.md) is about building, modifying
|
||||
and checking Bison. See its "Working from the Repository" section to build
|
||||
Bison from the git repo. Roughly, run:
|
||||
|
||||
$ git submodule update --init
|
||||
$ ./bootstrap
|
||||
|
||||
then proceed with the usual `configure && make` steps.
|
||||
|
||||
The file README-hacking.md is about building, modifying and checking Bison.
|
||||
|
||||
## Build from tarball
|
||||
See the file INSTALL for generic compilation and installation instructions.
|
||||
See the [INSTALL file](INSTALL) for generic compilation and installation
|
||||
instructions.
|
||||
|
||||
Bison requires GNU m4 1.4.6 or later. See
|
||||
https://ftp.gnu.org/gnu/m4/m4-1.4.6.tar.gz.
|
||||
|
||||
@@ -5,185 +5,6 @@ Everything related to the development of Bison is on Savannah:
|
||||
http://savannah.gnu.org/projects/bison/.
|
||||
|
||||
|
||||
Administrivia
|
||||
=============
|
||||
|
||||
## If you incorporate a change from somebody on the net:
|
||||
First, if it is a large change, you must make sure they have signed the
|
||||
appropriate paperwork. Second, be sure to add their name and email address
|
||||
to THANKS.
|
||||
|
||||
## If a change fixes a test, mention the test in the commit message.
|
||||
|
||||
## Bug reports
|
||||
If somebody reports a new bug, mention his name in the commit message and in
|
||||
the test case you write. Put him into THANKS.
|
||||
|
||||
The correct response to most actual bugs is to write a new test case which
|
||||
demonstrates the bug. Then fix the bug, re-run the test suite, and check
|
||||
everything in.
|
||||
|
||||
|
||||
|
||||
Hacking
|
||||
=======
|
||||
|
||||
## Visible Changes
|
||||
Which include serious bug fixes, must be mentioned in NEWS.
|
||||
|
||||
## Translations
|
||||
Only user visible strings are to be translated: error messages, bits of the
|
||||
.output file etc. This excludes impossible error messages (comparable to
|
||||
assert/abort), and all the --trace output which is meant for the maintainers
|
||||
only.
|
||||
|
||||
## Vocabulary
|
||||
- "nonterminal", not "variable" or "non-terminal" or "non terminal".
|
||||
Abbreviated as "nterm".
|
||||
- "shift/reduce" and "reduce/reduce", not "shift-reduce" or "shift reduce",
|
||||
etc.
|
||||
|
||||
## Syntax highlighting
|
||||
It's quite nice to be in C++ mode when editing lalr1.cc for instance.
|
||||
However tools such as Emacs will be fooled by the fact that braces and
|
||||
parens do not nest, as in `[[}]]`. As a consequence you might be misguided
|
||||
by its visual pairing to parens. The m4-mode is safer. Unfortunately the
|
||||
m4-mode is also fooled by `#`, which it sees as a comment, so it stops pairing with
|
||||
parens/brackets that are inside...
|
||||
|
||||
## Coding Style
|
||||
Do not add horizontal tab characters to any file in Bison's repository
|
||||
except where required. For example, do not use tabs to format C code.
|
||||
However, make files, ChangeLog, and some regular expressions require tabs.
|
||||
Also, test cases might need to contain tabs to check that Bison properly
|
||||
processes tabs in its input.
|
||||
|
||||
Prefer "res" as the name of the local variable that will be "return"ed by
|
||||
the function.
|
||||
|
||||
### Bison
|
||||
Follow the GNU Coding Standards.
|
||||
|
||||
Don't reinvent the wheel: we use gnulib, which features many components.
|
||||
Actually, Bison has legacy code that we should replace with gnulib modules
|
||||
(e.g., many ad hoc implementations of lists).
|
||||
|
||||
#### Includes
|
||||
The `#include` directives follow an order:
|
||||
- first section for *.c files is `<config.h>`. Don't include it in header
|
||||
files
|
||||
- then, for *.c files, the corresponding *.h file
|
||||
- then possibly the `"system.h"` header
|
||||
- then the system headers.
|
||||
Consider headers from `lib/` like system headers (i.e., `#include
|
||||
<verify.h>`, not `#include "verify.h"`).
|
||||
- then headers from src/ with double quotes (`#include "getargs.h"`).
|
||||
|
||||
Keep headers sorted alphabetically in each section.
|
||||
|
||||
See also the [Header
|
||||
files](https://www.gnu.org/software/gnulib/manual/html_node/Header-files.html)
|
||||
and the [Implementation
|
||||
files](https://www.gnu.org/software/gnulib/manual/html_node/Implementation-files.html#Implementation-files)
|
||||
nodes of the gnulib documentation.
|
||||
|
||||
Some source files are in the build tree (e.g., `src/scan-gram.c` made from
|
||||
`src/scan-gram.l`). For them to find the headers from `src/`, we actually
|
||||
use `#include "src/getargs.h"` instead of `#include "getargs.h"`---that
|
||||
saves us from additional `-I` flags.
|
||||
|
||||
### Skeletons
|
||||
We try to use the "typical" coding style for each language.
|
||||
|
||||
#### CPP
|
||||
We indent the CPP directives this way:
|
||||
|
||||
```
#if FOO
# if BAR
#  define BAZ
# endif
#endif
```
|
||||
|
||||
Don't indent with leading spaces in the skeletons (it's OK in the grammar
|
||||
files though, e.g., in `%code {...}` blocks).
|
||||
|
||||
On occasions, use `cppi -c` to see where we stand. We don't aim at full
|
||||
correctness: depending on `-d`, some bits can be in the *.c file, or the *.h
|
||||
file within the double-inclusion cpp-guards. In that case, favor the case
|
||||
of the *.h file, but don't waste time on this.
|
||||
|
||||
Don't hesitate to leave a comment on the `#endif` (e.g., `#endif /* FOO
|
||||
*/`), especially for long blocks.
|
||||
|
||||
There is no consistency on `! defined` vs. `!defined`. The day gnulib
|
||||
decides, we'll follow them.
|
||||
|
||||
#### C/C++
|
||||
Follow the GNU Coding Standards.
|
||||
|
||||
The `glr.c` skeleton was implemented with `camelCase`. We are migrating it
|
||||
to `snake_case`. Because we are standardizing the code, it is currently
|
||||
inconsistent.
|
||||
|
||||
Use `YYFOO` and `yyfoo` for entities that are exposed to the user. They are
|
||||
part of our contract with the users wrt backward compatibility.
|
||||
|
||||
Use `YY_FOO` and `yy_foo` for private matters. Users should not use them,
|
||||
we are free to change them without fear of backward compatibility issues.
|
||||
|
||||
Use `*_t` for types, especially for `yy*_t` in which case we shouldn't worry
|
||||
about the C standard introducing such a name.
|
||||
|
||||
#### C++
|
||||
Follow the C++ Core Guidelines
|
||||
(http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). The Google
|
||||
ones may be interesting too
|
||||
(https://google.github.io/styleguide/cppguide.html).
|
||||
|
||||
Our enumerators, such as the kinds (symbol and token kinds), should be lower
|
||||
case, but it was too late to follow that track for token kinds, and symbol
|
||||
kind enumerators are made to be consistent with them.
|
||||
|
||||
Use `*_type` for type aliases. Use `foo_get()` and `foo_set(v)` for
|
||||
accessors, or simply `foo()` and `foo(v)`.
|
||||
|
||||
Use the `yy` prefix for private stuff, but there's no need for it in the
|
||||
public API. The `yy` prefix is already taken care of via the namespace.
|
||||
|
||||
#### Java
|
||||
We follow https://www.oracle.com/technetwork/java/codeconventions-150003.pdf
|
||||
and https://google.github.io/styleguide/javaguide.html. Unfortunately at
|
||||
some point some GNU Coding Style was installed in Java, but it's an error.
|
||||
So we should for instance stop putting spaces in function calls. Because we
|
||||
are standardizing the code, it is currently inconsistent.
|
||||
|
||||
Use a 2-space indentation (Google) rather than 4 (Oracle).
|
||||
|
||||
Don't use the "yy" prefix for public members: "getExpectedTokens", not
|
||||
"yyexpectedTokens" or "yygetExpectedTokens".
|
||||
|
||||
## Commit Messages
|
||||
Imitate the style we use. Use `git log` to get sources of inspiration.
|
||||
|
||||
If the changes have a small impact on Bison's generated parser, embed these
|
||||
changes in the commit itself. If the impact is large, first push all the
|
||||
changes except those about src/parse-gram.[ch], and then another commit
|
||||
named "regen" which is only about them.
|
||||
|
||||
## Debugging
|
||||
Bison supports tracing of its various steps, via the `--trace` option.
|
||||
Since it is not meant for the end user, it is not displayed by `bison
|
||||
--help`, nor is it documented in the manual. Instead, run `bison
|
||||
--trace=help`.
|
||||
|
||||
## Documentation
|
||||
Use `@option` for options and options with their argument if they have no
|
||||
space (e.g., `@option{-Dfoo=bar}`). However, use `@samp` elsewhere (e.g.,
|
||||
`@samp{-I foo}`).
|
||||
|
||||
|
||||
Working from the Repository
|
||||
===========================
|
||||
|
||||
@@ -357,6 +178,196 @@ version, compile bison, then force it to recreate the files:
|
||||
$ make -C _build
|
||||
|
||||
|
||||
Administrivia
|
||||
=============
|
||||
|
||||
## If you incorporate a change from somebody on the net:
|
||||
First, if it is a large change, you must make sure they have signed the
|
||||
appropriate paperwork. Second, be sure to add their name and email address
|
||||
to THANKS.
|
||||
|
||||
## If a change fixes a test, mention the test in the commit message.
|
||||
|
||||
## Bug reports
|
||||
If somebody reports a new bug, mention his name in the commit message and in
|
||||
the test case you write. Put him into THANKS.
|
||||
|
||||
The correct response to most actual bugs is to write a new test case which
|
||||
demonstrates the bug. Then fix the bug, re-run the test suite, and check
|
||||
everything in.
|
||||
|
||||
|
||||
|
||||
Hacking
|
||||
=======
|
||||
|
||||
## Visible Changes
|
||||
Which include serious bug fixes, must be mentioned in NEWS.
|
||||
|
||||
## Translations
|
||||
Only user visible strings are to be translated: error messages, bits of the
|
||||
.output file etc. This excludes impossible error messages (comparable to
|
||||
assert/abort), and all the --trace output which is meant for the maintainers
|
||||
only.
|
||||
|
||||
## Vocabulary
|
||||
- "nonterminal", not "variable" or "non-terminal" or "non terminal".
|
||||
Abbreviated as "nterm".
|
||||
- "shift/reduce" and "reduce/reduce", not "shift-reduce" or "shift reduce",
|
||||
etc.
|
||||
|
||||
## Syntax Highlighting
|
||||
It's quite nice to be in C++ mode when editing lalr1.cc for instance.
|
||||
However tools such as Emacs will be fooled by the fact that braces and
|
||||
parens do not nest, as in `[[}]]`. As a consequence you might be misguided
|
||||
by its visual pairing to parens. The m4-mode is safer. Unfortunately the
|
||||
m4-mode is also fooled by `#`, which it sees as a comment, so it stops pairing with
|
||||
parens/brackets that are inside...
|
||||
|
||||
## Implementation Notes
|
||||
There are several places with interesting details about the implementation:
|
||||
- [Understanding C parsers generated by GNU
|
||||
Bison](https://www.cs.uic.edu/~spopuri/cparser.html) by Satya Kiran Popuri,
|
||||
is a wonderful piece of work that explains the implementation of Bison,
|
||||
- [src/gram.h](src/gram.h) documents the way the grammar is represented
|
||||
- [src/tables.h](src/tables.h) documents the generated tables
|
||||
- [data/README.md](data/README.md) contains details about the m4 implementation
|
||||
|
||||
## Coding Style
|
||||
Do not add horizontal tab characters to any file in Bison's repository
|
||||
except where required. For example, do not use tabs to format C code.
|
||||
However, make files, ChangeLog, and some regular expressions require tabs.
|
||||
Also, test cases might need to contain tabs to check that Bison properly
|
||||
processes tabs in its input.
|
||||
|
||||
Prefer `res` as the name of the local variable that will be "return"ed by
|
||||
the function.
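
As a minimal sketch of the `res` convention, the function below accumulates its result in a local named `res` and returns it. The function name `example_sum` and its parameters are hypothetical, chosen only for this illustration; they do not come from Bison's sources.

```c
#include <stddef.h>

/* Sum the LEN first items of ITEMS.  Hypothetical helper, used only to
   illustrate the naming of the returned local variable.  */
int
example_sum (const int *items, size_t len)
{
  int res = 0;
  for (size_t i = 0; i < len; ++i)
    res += items[i];
  return res;
}
```

Using one fixed name for the returned value makes it easy to spot, in any function, which variable carries the result.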
|
||||
|
||||
### Bison
|
||||
Follow the GNU Coding Standards.
|
||||
|
||||
Don't reinvent the wheel: we use gnulib, which features many components.
|
||||
Actually, Bison has legacy code that we should replace with gnulib modules
|
||||
(e.g., many ad hoc implementations of lists).
|
||||
|
||||
#### Includes
|
||||
The `#include` directives follow an order:
|
||||
- first section for *.c files is `<config.h>`. Don't include it in header
|
||||
files
|
||||
- then, for *.c files, the corresponding *.h file
|
||||
- then possibly the `"system.h"` header
|
||||
- then the system headers.
|
||||
Consider headers from `lib/` like system headers (i.e., `#include
|
||||
<verify.h>`, not `#include "verify.h"`).
|
||||
- then headers from src/ with double quotes (`#include "getargs.h"`).
|
||||
|
||||
Keep headers sorted alphabetically in each section.
|
||||
|
||||
See also the [Header
|
||||
files](https://www.gnu.org/software/gnulib/manual/html_node/Header-files.html)
|
||||
and the [Implementation
|
||||
files](https://www.gnu.org/software/gnulib/manual/html_node/Implementation-files.html#Implementation-files)
|
||||
nodes of the gnulib documentation.
|
||||
|
||||
Some source files are in the build tree (e.g., `src/scan-gram.c` made from
|
||||
`src/scan-gram.l`). For them to find the headers from `src/`, we actually
|
||||
use `#include "src/getargs.h"` instead of `#include "getargs.h"`---that
|
||||
saves us from additional `-I` flags.
|
||||
|
||||
### Skeletons
|
||||
We try to use the "typical" coding style for each language.
|
||||
|
||||
#### CPP
|
||||
We indent the CPP directives this way:
|
||||
|
||||
```
#if FOO
# if BAR
#  define BAZ
# endif
#endif
```
|
||||
|
||||
Don't indent with leading spaces in the skeletons (it's OK in the grammar
|
||||
files though, e.g., in `%code {...}` blocks).
|
||||
|
||||
On occasions, use `cppi -c` to see where we stand. We don't aim at full
|
||||
correctness: depending on `-d`, some bits can be in the *.c file, or the *.h
|
||||
file within the double-inclusion cpp-guards. In that case, favor the case
|
||||
of the *.h file, but don't waste time on this.
|
||||
|
||||
Don't hesitate to leave a comment on the `#endif` (e.g., `#endif /* FOO
|
||||
*/`), especially for long blocks.
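
The indentation and `#endif` comment conventions above can be sketched as follows. The macros `YY_EXAMPLE_TRACE`, `YY_EXAMPLE_VERBOSE` and `YY_EXAMPLE_LEVEL` and the function `yy_example_level` are hypothetical names made up for this illustration; they are not part of any skeleton.

```c
/* Hypothetical feature macros, defined here only for the example.  */
#define YY_EXAMPLE_TRACE 1
#define YY_EXAMPLE_VERBOSE 0

/* The `#` stays in the first column; each nesting level adds one space
   between the `#` and the directive name.  */
#if YY_EXAMPLE_TRACE
# if YY_EXAMPLE_VERBOSE
#  define YY_EXAMPLE_LEVEL 2
# else
#  define YY_EXAMPLE_LEVEL 1
# endif /* !YY_EXAMPLE_VERBOSE */
#else
# define YY_EXAMPLE_LEVEL 0
#endif /* YY_EXAMPLE_TRACE */

/* Return the trace level selected at compile time.  */
int
yy_example_level (void)
{
  int res = YY_EXAMPLE_LEVEL;
  return res;
}
```

With the macro values chosen here, the inner `#else` branch is taken, so `yy_example_level ()` returns 1.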
|
||||
|
||||
There is no consistency on `! defined` vs. `!defined`. The day gnulib
|
||||
decides, we'll follow them.
|
||||
|
||||
#### C/C++
|
||||
Follow the GNU Coding Standards.
|
||||
|
||||
The `glr.c` skeleton was implemented with `camelCase`. We are migrating it
|
||||
to `snake_case`. Because we are gradually standardizing the code, it is
|
||||
currently inconsistent.
|
||||
|
||||
Use `YYFOO` and `yyfoo` for entities that are exposed to the user. They are
|
||||
part of our contract with the users wrt backward compatibility.
|
||||
|
||||
Use `YY_FOO` and `yy_foo` for private matters. Users should not use them,
|
||||
we are free to change them without fear of backward compatibility issues.
|
||||
|
||||
Use `*_t` for types, especially for `yy*_t` in which case we shouldn't worry
|
||||
about the C standard introducing such a name.
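
To make the naming rules above concrete, here is a small sketch. Every identifier in it (`yyexample_span_t`, `YYEXAMPLE_MAXLINES`, `YY_EXAMPLE_MIN`, `yy_example_clamp`, `yyexample_span_length`) is hypothetical and exists only to show the `YYFOO`/`yyfoo` versus `YY_FOO`/`yy_foo` split; none of them is taken from the actual skeletons.

```c
/* Public macro (YYFOO form): part of the contract with users.  */
#define YYEXAMPLE_MAXLINES 1000

/* Private macro (YY_FOO form): free to change at any time.  */
#define YY_EXAMPLE_MIN(A, B) ((A) < (B) ? (A) : (B))

/* Public type (yy*_t form).  */
typedef struct
{
  int first_line;
  int last_line;
} yyexample_span_t;

/* Private helper (yy_foo form): cap LINE at the public maximum.  */
int
yy_example_clamp (int line)
{
  int res = YY_EXAMPLE_MIN (line, YYEXAMPLE_MAXLINES);
  return res;
}

/* Public function (yyfoo form): number of lines covered by SPAN,
   after clamping both ends.  */
int
yyexample_span_length (const yyexample_span_t *span)
{
  int res
    = yy_example_clamp (span->last_line)
      - yy_example_clamp (span->first_line) + 1;
  return res;
}
```

A user would be expected to call only `yyexample_span_length` and to rely on `yyexample_span_t` and `YYEXAMPLE_MAXLINES`; the underscore-separated `yy_`/`YY_` names could be renamed without breaking that contract.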
|
||||
|
||||
#### C++
|
||||
Follow the [C++ Core
|
||||
Guidelines](http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines).
|
||||
The [Google ones](https://google.github.io/styleguide/cppguide.html) may be
|
||||
interesting too.
|
||||
|
||||
Our enumerators, such as the kinds (symbol and token kinds), should be lower
|
||||
case, but it was too late to follow that track for token kinds, and symbol
|
||||
kind enumerators are made to be consistent with them.
|
||||
|
||||
Use `*_type` for type aliases. Use `foo_get()` and `foo_set(v)` for
|
||||
accessors, or simply `foo()` and `foo(v)`.
|
||||
|
||||
Use the `yy` prefix for private stuff, but there's no need for it in the
|
||||
public API. The `yy` prefix is already taken care of via the namespace.
|
||||
|
||||
#### Java
|
||||
We follow the [Java Code
|
||||
Conventions](https://www.oracle.com/technetwork/java/codeconventions-150003.pdf)
|
||||
and [Google Java Style
|
||||
Guide](https://google.github.io/styleguide/javaguide.html). Unfortunately
|
||||
at some point some GNU Coding Style was installed in Java, but it's an
|
||||
error. So we should for instance stop putting spaces in function calls.
|
||||
Because we are standardizing the code, it is currently inconsistent.
|
||||
|
||||
Use a 2-space indentation (Google) rather than 4 (Oracle).
|
||||
|
||||
Don't use the "yy" prefix for public members: "getExpectedTokens", not
|
||||
"yyexpectedTokens" or "yygetExpectedTokens".
|
||||
|
||||
## Commit Messages
|
||||
Imitate the style we use. Use `git log` to get sources of inspiration.
|
||||
|
||||
If the changes have a small impact on Bison's generated parser, embed these
|
||||
changes in the commit itself. If the impact is large, first push all the
|
||||
changes except those about src/parse-gram.[ch], and then another commit
|
||||
named "regen" which is only about them.
|
||||
|
||||
## Debugging
|
||||
Bison supports tracing of its various steps, via the `--trace` option.
|
||||
Since it is not meant for the end user, it is not displayed by `bison
|
||||
--help`, nor is it documented in the manual. Instead, run `bison
|
||||
--trace=help`.
|
||||
|
||||
## Documentation
|
||||
Use `@option` for options and options with their argument if they have no
|
||||
space (e.g., `@option{-Dfoo=bar}`). However, use `@samp` elsewhere (e.g.,
|
||||
`@samp{-I foo}`).
|
||||
|
||||
|
||||
Test Suite
|
||||
==========
|
||||
|
||||
@@ -366,9 +377,9 @@ examples, and the main test suite.
|
||||
|
||||
### The Examples
|
||||
In examples/, there is a number of ready-to-use examples (see
|
||||
examples/README.md). These examples have small test suites run by `make
|
||||
check`. The test results are in local `*.log` files (e.g.,
|
||||
`$build/examples/c/calc/calc.log`).
|
||||
[examples/README.md](examples/README.md)). These examples have small test
|
||||
suites run by `make check`. The test results are in local `*.log` files
|
||||
(e.g., `$build/examples/c/calc/calc.log`).
|
||||
|
||||
### The Main Test Suite
|
||||
The main test suite, in tests/, is written on top of GNU Autotest, which is
|
||||
@@ -548,7 +559,8 @@ re-run the tests, run:
|
||||
Release Procedure
|
||||
=================
|
||||
|
||||
See README-release.
|
||||
See the [README-release file](README-release), created when the package is
|
||||
bootstrapped.
|
||||
|
||||
<!--
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
This directory contains data needed by Bison.
|
||||
|
||||
# Directory content
|
||||
# Directory Content
|
||||
## Skeletons
|
||||
Bison skeletons: the general shapes of the different parser kinds, that are
|
||||
specialized for specific grammars by the bison program.
|
||||
@@ -48,7 +48,7 @@ various formats.
|
||||
- xml2xhtml.xsl
|
||||
Conversion into XHTML.
|
||||
|
||||
# Implementation note about the skeletons
|
||||
# Implementation Notes About the Skeletons
|
||||
|
||||
"Skeleton" in Bison parlance means "backend": a skeleton is fed by the bison
|
||||
executable with LR tables, facts about the symbols, etc. and they generate
|
||||
@@ -179,7 +179,7 @@ The data corresponding to the symbol `#POS`, where the current rule has
|
||||
Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols
|
||||
on RHS.
|
||||
|
||||
-----
|
||||
<!--
|
||||
|
||||
Local Variables:
|
||||
mode: markdown
|
||||
@@ -203,3 +203,5 @@ GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program. If not, see <http://www.gnu.org/licenses/>.
|
||||
|
||||
-->
|
||||
|
||||