240 Commits

Author SHA1 Message Date
ISSOtm
2b6d9cd1e0 Avoid using yytoken_kind_t
Apparently it was added in a fairly recent Bison version...
2020-12-10 13:32:18 +01:00
ISSOtm
9b6f01047c Enable raw token types
Removes one layer of indirection for the parser, and helps remove all literals from the grammar

The latter prepares the next change
2020-12-09 21:22:05 +01:00
ISSOtm
462fd7539c Prohibit nested macros
After discussion (starting at
https://github.com/gbdev/rgbds/pull/594#issuecomment-706437458),
it was decided that plain nested macros should not be allowed.
Since #590 is fixed, EQUS can be used as a workaround;
multiline strings (#589) will make that easier on the
user when implemented.
Fixes #588, supersedes and closes #594.
Additionally, closes #388.
2020-12-09 10:44:39 +01:00
ISSOtm
f16e34b804 Fix captures beginning in expansions
Fixes #590
2020-12-09 09:54:55 +01:00
ISSOtm
4f842a1248 Create specialized symbol finder functions
The old "find symbol with auto scope" function is now three:
- One finds the exact name passed to it, skipping any checks.
  This is useful e.g. if such checks were already performed.
- One checks that the name is not scoped, and calls the first.
  This is useful for names that cannot be scoped, such as checking for EQUS.
  Doing this instead of the third should improve performance somewhat, since
  this specific case is hit by the lexer each time an identifier is read.
- The last one checks if the name should be expanded (`.loc` → `Glob.loc`),
  and that the local part is not scoped. This is essentially the old function.
2020-11-21 01:06:17 +01:00
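The three-way split described in commit 4f842a1248 could look roughly like the sketch below. This is only an illustration, not the actual rgbds code: the function names (`findExact`, `findUnscoped`, `findWithScope`), the toy symbol table, and the `currentScope` variable are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy symbol table; every name in this sketch is hypothetical. */
static char const *symbols[] = { "Glob", "Glob.loc", "Other" };
static char const *currentScope = "Glob"; /* label that `.loc` attaches to */

/* 1. Looks up exactly the name given, with no checks at all. */
static char const *findExact(char const *name)
{
    for (size_t i = 0; i < sizeof(symbols) / sizeof(*symbols); i++)
        if (strcmp(symbols[i], name) == 0)
            return symbols[i];
    return NULL;
}

/* 2. Rejects scoped names (those containing '.'), then defers to findExact. */
static char const *findUnscoped(char const *name)
{
    if (strchr(name, '.'))
        return NULL; /* a real assembler would report an error here */
    return findExact(name);
}

/* 3. Expands a leading-dot local name (`.loc` -> `Glob.loc`), and rejects a
      local part that itself contains a dot. */
static char const *findWithScope(char const *name)
{
    char const *dot = strchr(name, '.');
    char full[256];

    if (!dot)
        return findExact(name); /* plain global name */
    if (strchr(dot + 1, '.'))
        return NULL; /* e.g. `A.b.c`: the local part is scoped */
    if (dot != name)
        return findExact(name); /* already fully qualified */
    if (snprintf(full, sizeof(full), "%s%s", currentScope, name) >= (int)sizeof(full))
        return NULL; /* expanded name would not fit */
    return findExact(full);
}
```

The second function is the cheap path: it does a single `strchr` and no string building, which is why it suits a lookup performed on every identifier the lexer reads.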
anderoonies
55be77be69 discard block comments delimited with /* */ 2020-10-15 12:42:53 -04:00
ISSOtm
01637768cf Rename asmy to more explicit parser
This should make the purpose of that file clearer to newcomers
2020-10-11 21:03:41 +02:00
ISSOtm
06f7387466 Avoid using VLA in EQUS dumping
MSVC does not support those...
Also add a `develop` warning about VLAs, to avoid future incidents
2020-10-06 08:55:45 +02:00
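As an illustration of the kind of change commit 06f7387466 describes, the sketch below replaces a variable-length array with a heap allocation. The helper `quoteString` is hypothetical and unrelated to the real EQUS dumping code; GCC and Clang do offer a `-Wvla` flag, though whether that is the warning the commit added is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd, double-quoted copy of `str`,
   or NULL if the allocation fails. The caller must free() the result. */
static char *quoteString(char const *str)
{
    size_t len = strlen(str);
    /* A VLA version would read `char buf[len + 3];`, which MSVC rejects. */
    char *buf = malloc(len + 3); /* two quotes + terminator */

    if (!buf)
        return NULL;
    buf[0] = '"';
    memcpy(&buf[1], str, len);
    buf[len + 1] = '"';
    buf[len + 2] = '\0';
    return buf;
}
```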
ISSOtm
21e50eeff1 Have lexer not require <unistd.h> on MSVC
Required for `open`, `close`, `read`, and `STDIN_FILENO`,
which are defined elsewhere on MSVC.
2020-10-06 08:55:45 +02:00
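A portability shim in the spirit of commit 21e50eeff1 might look like the following. It is a sketch under assumptions: MSVC's `<io.h>` provides `read` and friends, and `STDIN_FILENO` has to be defined by hand there. The wrapper `readSome` is hypothetical, and the real rgbds headers may be organized differently.

```c
#include <assert.h>

#ifdef _MSC_VER
# include <io.h>          /* open, close, read live here on MSVC */
# define STDIN_FILENO 0   /* not provided by MSVC's headers */
#else
# include <unistd.h>      /* POSIX home of read, close and STDIN_FILENO */
#endif

/* Hypothetical wrapper hiding the differing parameter and return types of
   read() on the two platforms. Returns -1 on error. */
static long readSome(int fd, char *buf, unsigned int size)
{
    return (long)read(fd, buf, size);
}
```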
ISSOtm
2eca43cd2d Fix critical oversight in lexer buffer refilling
Since the lexer buffer wraps, the refilling gets handled in two steps:
First, iff the buffer would wrap, the buffer is refilled until its end.
Then, if more characters are requested, that amount is refilled too.

An important detail is that `read()` may not return as many characters as
requested; for this reason, the first step checks if its `read()` was
"full", and skips the second step otherwise.
This is also where the bug lay.

After a *lot* of trying, I eventually managed to reproduce the bug on an
OpenBSD VM, and after adding a couple of `assert`s in `peekInternal`, this
is what happened, starting at line 724:

0. `lexerState->nbChars` is 0, `lexerState->index` is 19;
1. We end up with `target` = 42, and `writeIndex` = 19;
2. 42 + 19 is greater than `LEXER_BUF_SIZE` (= 42), so the `if` is entered;
3. Within the first `readChars`, **`read` only returns 16 bytes**,
   advancing `writeIndex` to 35 and `target` to 26;
4. Within the second `readChars`, a `read(26)` is issued, overflowing the
   buffer.

The bug should be clear now: **the check at line 750 failed to work!** Why?
Because `readChars` modifies `writeIndex`.
The fix is simply to cache the number of characters expected, and use that.
2020-10-04 16:10:32 +02:00
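The two-step refill and the fix described in commit 2eca43cd2d can be sketched as follows. This is not the rgbds source: `BUF_SIZE`, `read_fn`, `refill`, and `short_read` are hypothetical names, and `short_read` merely simulates a `read()` that returns at most 16 bytes per call, as in the trace in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 42 /* value the commit message quotes for LEXER_BUF_SIZE */

/* Hypothetical stand-in for read(): may return fewer bytes than requested. */
typedef size_t (*read_fn)(char *dest, size_t size);

/* Simulated source that never hands out more than 16 bytes per call. */
static size_t short_read(char *dest, size_t size)
{
    size_t n = size < 16 ? size : 16;

    memset(dest, 'x', n);
    return n;
}

/* Refills up to `target` bytes into the circular buffer, starting at
   `writeIndex`. Returns the number of bytes actually stored. */
static size_t refill(char buf[BUF_SIZE], size_t writeIndex, size_t target, read_fn rd)
{
    size_t total = 0;

    assert(writeIndex < BUF_SIZE && target <= BUF_SIZE);
    if (writeIndex + target > BUF_SIZE) {
        /* Step 1: fill up to the physical end of the buffer. */
        size_t expected = BUF_SIZE - writeIndex; /* cached BEFORE reading */
        size_t got = rd(&buf[writeIndex], expected);

        total += got;
        if (got != expected)
            return total; /* short read: skip step 2 instead of overflowing */
        writeIndex = 0;   /* wrap around */
        target -= got;
    }
    /* Step 2: whatever remains no longer crosses the end of the buffer. */
    if (target > 0)
        total += rd(&buf[writeIndex], target);
    return total;
}
```

With `writeIndex` = 19 and `target` = 42, step 1 asks for 23 bytes and receives 16. Because the comparison uses the cached `expected` rather than a value that the read itself updates, the function returns early and never issues the out-of-bounds second read.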
ISSOtm
423a7c4899 Handle \r better
Translate it to \n regardless of the lexer mode
2020-10-04 04:46:01 +02:00
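One way to picture the translation mentioned in commit 423a7c4899 is the small normalizer below. It is a sketch only: `normalizeNewlines` is a hypothetical name, it works on a whole buffer rather than inside a lexer, and collapsing a `\r\n` pair into a single `\n` is an assumption that the commit message does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Rewrites line endings in place so that "\r\n" and a lone "\r" both become
   "\n". Returns the new length of the data in `buf`. */
static size_t normalizeNewlines(char *buf, size_t len)
{
    size_t out = 0;

    for (size_t in = 0; in < len; in++) {
        char c = buf[in];

        if (c == '\r') {
            if (in + 1 < len && buf[in + 1] == '\n')
                in++; /* swallow the '\n' of a CRLF pair (assumption) */
            c = '\n';
        }
        buf[out++] = c;
    }
    return out;
}
```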
ISSOtm
930080f556 Mark not unmapping macro-containing files as okay
There isn't really a better alternative.
Making several mappings instead requires too much bookkeeping.
2020-10-04 04:46:01 +02:00
ISSOtm
8e7afb0ab3 Move some MSVC-specific defines to platform.h 2020-10-04 04:46:01 +02:00
ISSOtm
138523570e Fix possible uninitialized read on Windows 2020-10-04 04:46:01 +02:00
ISSOtm
82469ac0fd Shim around mmap on Windows 2020-10-04 04:46:01 +02:00
ISSOtm
b224cab3e0 Harmonize printing distance 2020-10-04 04:46:01 +02:00
ISSOtm
dbef51ba05 Move isWhitespace to a place where it makes more sense 2020-10-04 04:46:01 +02:00
ISSOtm
c952dd8a6e Fix fixed-point constants not working correctly
And added a test to check their behavior
2020-10-04 04:46:01 +02:00
ISSOtm
542b5d18f1 Fix possible capture buffer size overflow
Attempt to grow it to the max size first.
Seriously, if this triggers, *how*
2020-10-04 04:46:01 +02:00
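The "grow to the max size first" idea from commit 542b5d18f1 can be illustrated with a capacity helper like the one below. The name `nextCapacity` and the doubling strategy are assumptions made for this sketch; the commit message only says that the buffer is grown to the maximum size before giving up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the capacity to grow to, or 0 if the buffer cannot grow further.
   `cur` must be nonzero. */
static size_t nextCapacity(size_t cur)
{
    assert(cur > 0);
    if (cur == SIZE_MAX)
        return 0;        /* already at the maximum: genuinely out of room */
    if (cur > SIZE_MAX / 2)
        return SIZE_MAX; /* doubling would overflow, so clamp to the max */
    return cur * 2;
}
```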
ISSOtm
71a0a42cfb Fix C2x use of static_assert 2020-10-04 04:46:01 +02:00
ISSOtm
ac011fe69f Use common function to discard comments in macro args 2020-10-04 04:46:01 +02:00
ISSOtm
9e3d7a50e6 Handle comments in line continuations 2020-10-04 04:46:00 +02:00
ISSOtm
615f1072d9 Fix readFractionalPart never shifting characters 2020-10-04 04:46:00 +02:00
ISSOtm
f7b7a97407 Prevent expanding macro args in comments
Also use a cleaner way, instead of hardcoding to capture
2020-10-04 04:46:00 +02:00
ISSOtm
ece6853e0f Implement opt b and opt g 2020-10-04 04:46:00 +02:00
ISSOtm
aa76603da9 Add line+col trace info to lexer 2020-10-04 04:45:59 +02:00
ISSOtm
35396e6410 Fix files being unmapped when still referenced by macros 2020-10-04 04:45:59 +02:00
ISSOtm
8d18b39eee Support missing register tokens
Made possible by #491
2020-10-04 04:45:59 +02:00
ISSOtm
e4f2fad215 Support line continuations in main scope 2020-10-04 04:45:58 +02:00
ISSOtm
3f5f9bcaf0 Fix numeric constant overflow checks 2020-10-04 04:45:58 +02:00
ISSOtm
08867b3cec Enable catching invalid macro arg 0 2020-10-04 04:45:55 +02:00
ISSOtm
9081feab51 Reinstate macro arg scan distance
It used to be broken, so it was removed, but removing it prevents escaping macro args.
It was therefore put back in, with corrected behavior
2020-10-04 04:39:27 +02:00
ISSOtm
cf992164f7 Fix lexer capture sometimes not being reset 2020-10-04 04:39:27 +02:00
ISSOtm
b27b821e7f Fix RAW lexer length underflow
Also added an assertion to check against more such overflows
2020-10-04 04:39:26 +02:00
ISSOtm
d9ecaabac1 Add debug tracing code to lexer
Hidden behind a #define, like YYDEBUG
2020-10-04 04:39:26 +02:00
ISSOtm
cd747d8175 Fix many lexer bugs
More to come...
2020-10-04 04:39:25 +02:00
ISSOtm
df75fd2ec2 Fix expansion reporting being incorrect 2020-10-04 04:38:53 +02:00
ISSOtm
adcaf4cd46 Fix crash when no macro args are being used 2020-10-04 04:38:53 +02:00
ISSOtm
81a77a9b88 Re-implement block copy to avoid expanding macro args
They were expanded during the capture, and there was no easy way to
avoid expanding them (believe me, after three hours and somehow an OOM, I
gave up trying).
2020-10-04 04:38:53 +02:00
ISSOtm
6e805cd318 Implement macro args
This finally allows running 90% of the test suite, debugging time!
2020-10-04 04:38:53 +02:00
ISSOtm
e11f25024e Add test for built-in file symbol
It's currently defined in fstack.c, making it more prone to accidental
dropping. Let's not repeat the 0.3.9 scenario...
2020-10-04 04:38:53 +02:00
ISSOtm
38bda7e1bb Fix string expansion reporting
More expansions were allowed than the limit specified, and reporting code
did not account for the extra one that caused overflow
2020-10-04 04:38:52 +02:00
ISSOtm
149db9a022 Fix incorrect freeing of expansions
Freeing an expansion should free its children, not its siblings...
Fixes a use-after-free reported by scan-build. Nice catch!
2020-10-04 04:38:52 +02:00
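The ownership rule stated in commit 149db9a022 (free the children, not the siblings) can be sketched on a simple tree. The `struct Expansion` layout, the helper names, and the `nbFreed` counter are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct Expansion {
    struct Expansion *firstChild; /* expansions nested inside this one */
    struct Expansion *next;       /* next sibling, owned by the parent */
};

static unsigned nbFreed = 0; /* instrumentation for the sketch only */

static struct Expansion *newExpansion(void)
{
    struct Expansion *exp = malloc(sizeof(*exp));

    assert(exp);
    exp->firstChild = NULL;
    exp->next = NULL;
    return exp;
}

/* Frees `exp` and everything nested inside it. Siblings of `exp` itself are
   left alone, since something else may still reference them. */
static void freeExpansion(struct Expansion *exp)
{
    struct Expansion *child = exp->firstChild;

    while (child) {
        struct Expansion *nextChild = child->next; /* read before freeing */

        freeExpansion(child);
        child = nextChild;
    }
    free(exp);
    nbFreed++;
}
```

Walking `exp->next` instead of `exp->firstChild` would free nodes that are still linked from elsewhere, which is the shape of use-after-free the commit mentions.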
ISSOtm
fed252bc49 Fix nested expansions being incorrectly handled
The biggest problem was simply that the length of children expansions was
not accounted for when skipping over the parent... this took a lot of
arduous debugging, but it finally works!
2020-10-04 04:38:52 +02:00
ISSOtm
61b2fd9816 Add string expansion reporting
And fix line counting with expansion-made newlines.
This has the same bug as the old lexer (equs-newline's output does not
print the second warning as being part of the expansion).
Additionally, we regress equs-recursion, as we are no longer able to
catch this specific EQUS recursion. Simply enough, the new expansion
begins **after** the old one ends! I have found no way to handle that.
2020-10-04 04:38:52 +02:00
ISSOtm
5ad7a93750 Add EQUS expansion 2020-10-04 04:38:52 +02:00
ISSOtm
2ec10012b6 Fix mmap read offset not being initialized 2020-10-04 04:38:52 +02:00
ISSOtm
e56c6cc291 Fix PC's name not being passed to parser 2020-10-04 04:38:52 +02:00
ISSOtm
4c9a929a14 Implement almost all functionality
Add keywords and identifiers
Add comments
Add number literals
Add strings
Add a lot of new tokens
Add (and clean up) IF etc.
Improve reporting of unexpected chars / garbage bytes
Fix bug with, and improve error messages for, failing to open a file
Add verbose-level messages about how files are opened
Enforce that files finish with a newline
Fix chars returned not being cast to unsigned char (may conflict w/ EOF)
Return null path when no file is open, rather than crash
Unify and improve error printing slightly

Known to be missing: macro expansion, REPT blocks, EQUS expansions
2020-10-04 04:38:50 +02:00
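The `unsigned char` item in commit 4c9a929a14 refers to a classic C pitfall: where `char` is signed, a byte such as 0xFF converts to -1, which is usually the value of `EOF`. A minimal sketch, with `nextChar` as a hypothetical reader over an in-memory buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Returns the next byte as a value in 0..UCHAR_MAX, or EOF once the buffer
   is exhausted. The cast keeps byte 0xFF distinct from EOF. */
static int nextChar(char const *buf, size_t size, size_t *index)
{
    if (*index >= size)
        return EOF;
    return (unsigned char)buf[(*index)++];
}
```

Without the cast, a file containing a 0xFF byte could appear to end early on platforms where plain `char` is signed.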
ISSOtm
71f8871702 Implement more functionality
Macro arg detection, first emitted tokens, primitive (bad) column counting
2020-10-04 04:37:58 +02:00