Since the lexer buffer wraps, the refilling gets handled in two steps:
First, iff the buffer would wrap, the buffer is refilled until its end.
Then, if more characters are requested, that amount is refilled too.
An important detail is that `read()` may not return as many characters as
requested; for this reason, the first step checks if its `read()` was
"full", and skips the second step otherwise.
This is also where a bug lay.
After a *lot* of trying, I eventually managed to reproduce the bug on an
OpenBSD VM, and after adding a couple of `assert`s in `peekInternal`, this
is what happened, starting at line 724:
0. `lexerState->nbChars` is 0, `lexerState->index` is 19;
1. We end up with `target` = 42, and `writeIndex` = 19;
2. 42 + 19 is greater than `LEXER_BUF_SIZE` (= 42), so the `if` is entered;
3. Within the first `readChars`, **`read` only returns 16 bytes**,
advancing `writeIndex` to 35 and `target` to 26;
4. Within the second `readChars`, a `read(26)` is issued, overflowing the
buffer.
The bug should be clear now: **the check at line 750 failed to work!** Why?
Because `readChars` modifies `writeIndex`.
The fix is simply to cache the number of characters expected, and use that.
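The two-step refill and the fix can be sketched as follows. This is a minimal illustration, not the actual `peekInternal` code: `struct Ring`, `struct Source`, `shortRead` and `refill` are hypothetical names, and `shortRead` stands in for a `read()` that may return fewer bytes than requested. The point it tries to show is that the expected count is cached before `readChars` modifies `writeIndex`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 42 /* mirrors the LEXER_BUF_SIZE value from the trace above */

/* Hypothetical stand-in for a file: hands out at most `cap` bytes per call */
struct Source {
	char const *data;
	size_t len; /* bytes remaining */
	size_t cap; /* maximum bytes returned by a single call (mimics short reads) */
};

static size_t shortRead(struct Source *src, char *dest, size_t n)
{
	if (n > src->cap)
		n = src->cap;
	if (n > src->len)
		n = src->len;
	memcpy(dest, src->data, n);
	src->data += n;
	src->len -= n;
	return n;
}

/* Hypothetical wrapping buffer, reduced to what the refill needs */
struct Ring {
	char buf[BUF_SIZE];
	size_t writeIndex;
};

/* Reads up to `n` contiguous bytes at `writeIndex`, and advances it */
static size_t readChars(struct Ring *ring, struct Source *src, size_t n)
{
	/* A single read must never run past the end of the buffer */
	assert(ring->writeIndex + n <= BUF_SIZE);
	size_t got = shortRead(src, &ring->buf[ring->writeIndex], n);

	ring->writeIndex = (ring->writeIndex + got) % BUF_SIZE;
	return got;
}

/* Refills up to `target` bytes in at most two steps; returns the bytes read */
static size_t refill(struct Ring *ring, struct Source *src, size_t target)
{
	size_t total = 0;

	if (ring->writeIndex + target > BUF_SIZE) {
		/* Cache the expected count BEFORE readChars changes writeIndex */
		size_t expected = BUF_SIZE - ring->writeIndex;
		size_t got = readChars(ring, src, expected);

		total += got;
		target -= got;
		if (got != expected)
			return total; /* Short read: skip the second step */
	}
	if (target != 0)
		total += readChars(ring, src, target);
	return total;
}
```

With `writeIndex` = 19, a request for 42 bytes, and a first read that only yields 16, this sketch stops after the first step instead of issuing the overflowing `read(26)`.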
They were expanded during the capture, and there was no easy way to
avoid expanding them (believe me, after three hours and somehow an OOM, I
gave up trying).
The biggest problem was simply that the length of children expansions was
not accounted for when skipping over the parent... this took a lot of
arduous debugging, but it finally works!
And fix line counting with expansion-made newlines.
This has the same bug as the old lexer (equs-newline's output does not
print the second warning as being part of the expansion).
Additionally, we regress equs-recursion, as we are no longer able to
catch this specific EQUS recursion. Simply enough, the new expansion
begins **after** the old one ends! I have found no way to handle that.
Add keywords and identifiers
Add comments
Add number literals
Add strings
Add a lot of new tokens
Add (and clean up) IF etc.
Improve reporting of unexpected chars / garbage bytes
Fix bug and improve error messages when failing to open file
Add verbose-level messages about how files are opened
Enforce that files finish with a newline
Fix chars returned not being cast to unsigned char (may conflict w/ EOF)
Return null path when no file is open, rather than crash
Unify and improve error printing slightly
Known to be missing: macro expansion, REPT blocks, EQUS expansions
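To illustrate the `unsigned char` cast listed above: where plain `char` is signed, a byte such as 0xFF read from a file becomes -1 when widened to `int`, which is the usual value of `EOF`. A minimal sketch, where `nextChar` is a hypothetical helper and not the lexer's actual function:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns the next byte of `buf` as an int in the
 * range 0..255, or EOF once `len` bytes have been consumed.
 */
static int nextChar(char const *buf, size_t len, size_t *pos)
{
	assert(buf != NULL && pos != NULL);
	if (*pos >= len)
		return EOF;
	/* Without this cast, a 0xFF byte could compare equal to EOF */
	return (unsigned char)buf[(*pos)++];
}
```

This mirrors the convention of `getc`, which returns the character as an `unsigned char` converted to `int`.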
Create a new file, platform.h, for platform-specific hacks
For MSVC, this includes defining strncasecmp to _stricmp and
strdup to _strdup, among other things like defining missing
stat macros
Change some things not supported in MSVC, like _Static_assert,
to their counterparts (in this case, static_assert)
Replace usage of VLAs with malloc and free
Update getopt_long and use the getopt implementation from musl
on Windows.
Use comments to show which functions from platform.h are being used
This should help make RGBDS portable to systems with 16-bit integers,
like DOS.
For kicks, use the macros for 16-bit and 8-bit integers.
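As an illustration, assuming the macros in question are the `<inttypes.h>` format macros (`PRIx16`, `PRIu8`, and so on), which this message does not spell out: they expand to the correct `printf` length modifier for each fixed-width type, so the format string stays valid whatever the width of `int`. `formatWord` is a hypothetical helper, not a function from the code base.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a 16-bit value and an 8-bit bank number
 * without assuming anything about the width of `int`.
 * Returns what snprintf returns (the length that would be written).
 */
static int formatWord(char *dest, size_t size, uint16_t value, uint8_t bank)
{
	assert(dest != NULL);
	return snprintf(dest, size, "$%04" PRIx16 " (bank %" PRIu8 ")",
			value, bank);
}
```

Calling `formatWord(buf, sizeof(buf), 0xC000, 3)` with a large enough `buf` writes `$c000 (bank 3)`.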
Fix other miscellaneous things, like #include ordering and
printf-format-related issues.
Reduce repetition in math.c while I'm there.