After discussion (starting there:
https://github.com/gbdev/rgbds/pull/594#issuecomment-706437458
), it was decided that plain nested macros should not be
allowed.
Since #590 is fixed, EQUS can be used as a workaround;
multiline strings (#589) will make that easier on the
user when implemented.
Fixes#588, supersedes and closes#594.
Additionally, closes#388.
The old "find symbol with auto scope" function is now three:
- One finds the exact name passed to it, skipping any checks
This is useful e.g. if such checks were already performed.
- One checks that the name is not scoped, and calls the first.
This is useful for names that cannot be scoped, such as checking for EQUS.
Doing this instead of the third should improve performance somehwat, since
this specific case is hit by the lexer each time an identifier is read.
- The last one checks if the name should be expanded (`.loc` → `Glob.loc`),
and that the local part is not scoped. This is essentially the old function.
Since the lexer buffer wraps, the refilling gets handled in two steps:
First, iff the buffer would wrap, the buffer is refilled until its end.
Then, if more characters are requested, that amount is refilled too.
An important detail is that `read()` may not return as many characters as
requested; for this reason, the first step checks if its `read()` was
"full", and skips the second step otherwise.
This is also where a bug lied.
After a *lot* of trying, I eventually managed to reproduce the bug on an
OpenBSD VM, and after adding a couple of `assert`s in `peekInternal`, this
is what happened, starting at line 724:
0. `lexerState->nbChars` is 0, `lexerState->index` is 19;
1. We end up with `target` = 42, and `writeIndex` = 19;
2. 42 + 19 is greater than `LEXER_BUF_SIZE` (= 42), so the `if` is entered;
3. Within the first `readChars`, **`read` only returns 16 bytes**,
advancing `writeIndex` to 35 and `target` to 26;
4. Within the second `readChars`, a `read(26)` is issued, overflowing the
buffer.
The bug should be clear now: **the check at line 750 failed to work!** Why?
Because `readChars` modifies `writeIndex`.
The fix is simply to cache the number of characters expected, and use that.
They were expanded during the capture, and there was no easy way to
avoid expanding them (believe me, after three hours and somehow an OOM, I
gave up trying).
The biggest problem was simply that the length of children expansions was
not accounted for when skipping over the parent... this took a lot of
arduous debugging, but it finally works!
And fix line counting with expansion-made newlines.
This has the same bug as the old lexer (equs-newline's output does not
print the second warning as being part of the expansion).
Additionally, we regress equs-recursion, as we are no longer able to
catch this specific EQUS recursion. Simply enough, the new expansion
begins **after** the old one ends! I have found no way to handle that.
Add keywords and identifiers
Add comments
Add number literals
Add strings
Add a lot of new tokens
Add (and clean up) IF etc.
Improve reporting of unexpected chars / garbage bytes
Fix bug with and improved error messages when failing to open file
Add verbose-level messages about how files are opened
Enforce that files finish with a newline
Fix chars returned not being cast to unsigned char (may conflict w/ EOF)
Return null path when no file is open, rather than crash
Unify and improve error printing slightly
Known to be missing: macro expansion, REPT blocks, EQUS expansions