Commit Graph

252 Commits

Author SHA1 Message Date
ISSOtm
b037d54f64 Remove deprecated symbols
Fixes #896
2022-05-21 21:45:06 +02:00
ISSOtm
7dd8ba37f1 Allow changing recursion depth limit at runtime 2022-02-05 20:32:56 +01:00
ISSOtm
eb5af70d79 Add unsigned right shift operator 2022-02-05 20:12:15 +01:00
Rangi
d073cffa74 Don't use new as a variable name
It conflicts with the C++ keyword
2021-11-24 22:48:28 -05:00
Rangi
8435a29c4e Turn the readChars macro into a readInternal function
This macro was only used twice, and the second usage did
some unnecessary work.
2021-11-25 00:26:23 +01:00
Rangi
aac839f389 Remove dbgPrint and TRACE_LEXER support
I have not found `TRACE_LEXER` to be useful in debugging
actual lexer issues.
2021-11-22 23:49:59 +01:00
Rangi
ec6d63bce3 Allow underscores in gfx literals (#951)
Fixes #950
2021-11-21 16:18:23 -05:00
Rangi
cedfd2582a Move more statements into for loop clauses 2021-11-19 22:55:20 -05:00
Rangi
c7322258fc Refactor readGfxConstant for consistency, and edit warning message 2021-11-19 21:36:56 -05:00
Rangi
8e2a164a32 Implement compound assignment operators for mutable constants
Fixes #943
2021-11-19 08:50:00 +01:00
Rangi
efccf6c931 A few stylistic tweaks
- `goto free_romx` -> the more typical `goto cleanup`
- `goto fail` -> the more typical `goto finish`
- Remove a redundant `todo` variable
2021-11-17 23:51:40 -05:00
Rangi
438963fb24 Remove unused #include "extern/utf8decoder.h" (#940)
Fixes #937
2021-11-12 23:37:19 +01:00
ISSOtm
1a07391a97 Introduce ARRAY_SIZE macro
Checked by `checkpatch`, and you know what? Not a bad thing
See https://github.com/gbdev/rgbds/pull/931#discussion_r738856724
2021-10-31 07:53:33 +01:00
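As a rough illustration of what commit 1a07391a97 above introduces, an `ARRAY_SIZE` macro is conventionally written as below. This is a minimal sketch, not the definition from the rgbds tree: the exact macro body, the `directives` table, and `directive_count` are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical definition; the macro actually added in 1a07391a97 may differ. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof(*(arr)))

/* Illustrative table, not taken from rgbds. */
static char const *const directives[] = { "REPT", "FOR", "MACRO" };

/* Counts the table entries without hard-coding the length. */
size_t directive_count(void)
{
	return ARRAY_SIZE(directives);
}
```

The macro only works on true arrays; applied to a pointer it silently yields the wrong value, which is one reason linters such as `checkpatch` prefer a single shared definition.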
Rangi
4a73eb56ea Make peek() tail recursive instead of using goto
Compilation is identical with `gcc` or `clang`, `-O3` or `-O2`
2021-08-18 01:30:47 +02:00
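The transformation described in 4a73eb56ea above (replacing a backward `goto` with a tail call) can be sketched on a much simpler function. The two hypothetical helpers below are not the rgbds `peek()`; they only show the general shape of the rewrite.

```c
/* Loop expressed with a backward goto. */
char const *skip_blanks_goto(char const *s)
{
restart:
	if (*s == ' ' || *s == '\t') {
		s++;
		goto restart;
	}
	return s;
}

/* Same logic, with the "jump back to the top" written as a tail call. */
char const *skip_blanks_tail(char const *s)
{
	if (*s == ' ' || *s == '\t')
		return skip_blanks_tail(s + 1);
	return s;
}
```

C does not guarantee tail-call elimination, so this style relies on the optimizer; the commit message reports identical output at `-O2` and `-O3`, which says nothing about unoptimized builds.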
Rangi
03bb510588 endCapture shouldn't handle lexerState->atLineStart
`startCapture` did not initialize `lexerState->atLineStart`;
its final value is a consequence of the separate but similar
behaviors within `lexer_CaptureRept` and `lexer_CaptureMacroBody`.
2021-07-04 18:31:46 -04:00
Rangi
695dfe9dbd Add missing file line-continuation-string.asm
Also make some minor formatting corrections
2021-07-04 16:12:34 -04:00
Rangi
9782f7d942 Factor out endCapture to go with startCapture (#904)
This also refactors `startCapture` to modify the
capture body as an argument.
2021-07-04 16:08:59 -04:00
Rangi
23721694ea Comment that anonymous labels internally start with '!'
`startsIdentifier` should not accept this character so
anonymous labels won't conflict with named ones.
2021-05-15 12:57:22 -04:00
Eldred Habert
c06985a7ad Fix incorrect lexing of "$ff00+c" (#882)
Fixes #881 by moving the task from the lexer to the parser.
This both alleviates the need for backtracking in the lexer,
removing what is (was) arguably a hack, and causes tokenization
boundaries to be properly respected, fixing the issue mentioned above.

Co-authored-by: Rangi <remy.oukaour+rangi42@gmail.com>
2021-05-05 02:04:19 +02:00
ISSOtm
dcb8c69661 Fix UAF in lexer capture
Fixes #689
2021-05-02 03:24:18 +02:00
Rangi
d37aa93a7d Port some cleanup from the WIP 'strings' branch
This is mostly variable renaming
2021-04-28 11:58:56 -04:00
Rangi
3fdf01c0f5 Resolve some TODO comments
- `out_PushSection` should not set `currentSection` to NULL because
  PUSHS, PUSHC, and PUSHO consistently keep the current section,
  charmap, and options, even though the stack has been pushed.

- `Callback__FILE__` does not need to assert that `fileName` is not
  empty because `__FILE__`'s value is quoted, and can safely be empty.

- `YY_FATAL_ERROR` and `YYLMAX` are not needed since the lexer is
  not generated with flex.
2021-04-26 15:52:30 -04:00
Rangi
e050803ed1 Use size_t for measuring nested depths
Multiple functions involve tracking the current depth
of a nested structure (symbol expansions, interpolations,
REPT/FOR blocks, parentheses).
2021-04-23 14:28:10 +02:00
Rangi
27f38770d4 Parentheses in macro args prevent commas from starting new arguments
This is similar to C's behavior, and convenient for passing
function calls as single values, like `MUL(3.0, 4.0)` or
`STRSUB("str", 2, 1)`.

Fixes #704
2021-04-23 14:28:10 +02:00
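The rule described in 27f38770d4 above (a comma only starts a new macro argument when it is outside all parentheses) can be sketched as a depth counter. The function `count_macro_args` is hypothetical and much simpler than the real lexer: it ignores string literals, escapes, and whitespace trimming, and it treats any input, even an empty one, as at least one argument.

```c
#include <stddef.h>

/* Counts comma-separated arguments, ignoring commas nested in parentheses. */
size_t count_macro_args(char const *args)
{
	size_t depth = 0; /* Current parenthesis nesting depth */
	size_t count = 1; /* Assumption: the text holds at least one argument */

	for (char const *p = args; *p != '\0'; p++) {
		if (*p == '(') {
			depth++;
		} else if (*p == ')') {
			if (depth > 0) /* Ignore unbalanced closing parentheses */
				depth--;
		} else if (*p == ',' && depth == 0) {
			count++; /* Only a top-level comma separates arguments */
		}
	}
	return count;
}
```

With this rule, `MUL(3.0, 4.0), 5` counts as two arguments rather than three. The depth is a `size_t` since it can never be negative, in the spirit of e050803ed1.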
Rangi
e596dbfc80 Make failed macro arg expansions non-fatal
Expanding empty strings is valid but pointless;
macro args already skipped doing so, now other
`beginExpansion` calls do too.

This also fixes failed interpolations (which were
already non-fatal) to continue reading characters,
not evaluate to their initial '{' character.
2021-04-22 09:59:02 +02:00
Rangi
c3e27217dd More specific "Symbol name too long" error messages
Identifiers, {interpolations} and \<macroArgs> are distinct
2021-04-20 17:14:21 +02:00
Rangi
fe3521c7a4 Switch from parentheses to angle brackets
`\(` is more likely to be a valid escape sequence in the
future (as is `\[`) and `\{` is already taken.
2021-04-20 17:14:21 +02:00
Rangi
7a314e7aff Support numeric symbol names in \(parentheses)
For example, \(_NARG) will get the last argument
2021-04-20 17:14:21 +02:00
Rangi
637bbbdf43 Support multi-digit macro arguments in parentheses
This allows access to arguments past \9 without using 'shift'
2021-04-20 17:14:21 +02:00
Rangi
8230e8165c Eliminate isAtEOF by changing yylex control flow
`yylex` calls `yywrap` at the beginning of the next call, after it
has set `lexerState->lastToken` to `T_EOB`.
2021-04-20 17:10:08 +02:00
Rangi
a727a0f81f Capture termination status is equivalent to not having reached EOF
This avoids the need for a separate `terminated` flag
2021-04-20 17:10:08 +02:00
Rangi
7a587eb7d6 Use midrule action values for captures' terminated status
Bison 3.1 introduces "typed midrule values", which would let us write
`<captureTerminated>{ ... }` and `$$` instead of `{ ... }` and
`$<captureTerminated>[1-9]`, but rgbds supports Bison 3.0 and even older versions.
2021-04-20 17:10:08 +02:00
Rangi
7ac8bd6e24 Return a marker token at the end of any buffer
Removes the lexer hack mentioned in #778
2021-04-20 17:10:08 +02:00
Rangi
be2572edca Track nested interpolation depth even outside string literals
Fixes #837
2021-04-20 09:37:29 -04:00
ISSOtm
6d0a3c75e9 Get rid of Hungarian notation for good
Bye bye it was not nice knowing ya
2021-04-19 22:12:10 +02:00
Rangi
52797b6f68 Implement SIZEOF("Section") and STARTOF("Section") (#766)
Updates the object file revision to 8

Fixes #765
2021-04-17 18:36:26 -04:00
Rangi
2005ed1df9 Implement CHARLEN and CHARSUB
Fixes #786
2021-04-17 18:18:34 -04:00
Rangi
9923fa3eee Fix expansions that start from the end of another expansion (#839)
Do not free an expansion until its offset is *past* its size.
This means potentially freeing a nested stack of expansions
all at once.

Fixes #696
2021-04-17 13:14:40 -04:00
Rangi
c755fa3469 readIdentifier does not process characters that get truncated
Previously a '.' could be past the truncation limit but still
cause the identifier to be marked as local, violating an
assertion in `sym_AddLocalLabel`.

Fixes #832
2021-04-16 21:15:01 -04:00
Rangi
e78a1d5bfd readInterpolation is limited by nMaxRecursionDepth
Fixes #837
2021-04-16 16:10:46 -04:00
Rangi
5c852c7651 Store the nested expansions starting from the deepest one (#829)
This shortens the lexer by 100 lines and simplifies
access to expansion contents, since it usually needs the
deepest one, not the top-level one.

Fixes #813
2021-04-16 09:54:13 -04:00
Rangi
6be3584467 LexerState's 'size' and 'offset' for mmapped files are unsigned
These were using signed 'off_t' because that is the type of
'st_size' from 'stat()', but neither one can be negative.
2021-04-16 10:23:37 +02:00
Rangi
8c90d9d2d7 Get rid of skip in struct Expansion
This was only used to skip the two macro arg characters,
but shiftChar() can skip them before the expansion.
2021-04-16 10:23:37 +02:00
Rangi
f69e666b00 expansionOfs cannot be negative
lexerState->expansionOfs is always either set to 0, or updated by
adding a positive quantity:

    if (distance > lexerState->expansions->distance) {
        lexerState->expansionOfs += distance - lexerState->expansions->distance;
        ...
    }

so it will always be positive or zero.
2021-04-16 10:23:37 +02:00
Rangi
eba06404f0 peek(0) => peek()
This does not completely refactor `peek` as #708 suggested,
to make it shift and cache a character itself. However it
does simplify the lexer code.
2021-04-16 10:23:37 +02:00
Rangi
9558ccea1b shiftChars(1) => shiftChar()
Only two sites were for distances greater than 1:
a `shiftChars(2)`, trivial to just do two `shiftChar()`s;
and `shiftChars(size)` in `reportGarbageChar`, which
can be a `for` loop, and should be fixed anyway to
"avoid having to peek further than 0".
2021-04-16 10:23:37 +02:00
Rangi
260d372acd Lex $ff00+c without needing large peek lookahead
This also allows arbitrary amounts of whitespace in `$ff00 + c`,
instead of needing to fit in the 42-byte LEXER_BUF_SIZE
2021-04-16 10:23:37 +02:00
Rangi
b3312886fb Use a lookupExpansion, but not as an X macro
Instead of defining `LOOKUP_PRE_NEST` and `LOOKUP_POST_NEST`,
pass a variable name and a block to `lookupExpansion`; it
will assign successive looked-up expansions to the variable
and use them in the block.

The technique of using `__VA_ARGS__` to allow commas within a
block passed to a macro is not original, and should be stable.
2021-04-13 17:58:46 +02:00
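The `__VA_ARGS__` technique mentioned in b3312886fb above can be sketched with a small list-walking macro. Everything here (`struct Node`, `FOR_EACH_NODE`, `weighted_sum`) is hypothetical and unrelated to the real `lookupExpansion`; the point is only that a block containing a top-level comma survives being passed as a macro argument.

```c
#include <stddef.h>

struct Node {
	int value;
	struct Node *next;
};

/*
 * Hypothetical macro: binds each node of the list to `var` in turn and runs
 * the block. The block is taken as `...` so that commas inside it, which the
 * preprocessor would otherwise treat as argument separators, are pasted back
 * in by __VA_ARGS__.
 */
#define FOR_EACH_NODE(var, head, ...) \
	do { \
		for (struct Node *var = (head); var != NULL; var = var->next) { \
			__VA_ARGS__ \
		} \
	} while (0)

int weighted_sum(struct Node *head)
{
	int total = 0;

	/* The declaration below contains a comma outside any parentheses. */
	FOR_EACH_NODE(node, head,
		int value = node->value, weight = 2;
		total += value * weight;
	);
	return total;
}
```

Had the macro taken the block as a single named parameter, the comma in the declaration would have split it into two arguments and the invocation would fail to compile.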
Rangi
7fc8a65d0a Refactor the lexer to not use the lookupExpansion X macro
This macro was only used twice, in `beginExpansion` and
`lexer_DumpStringExpansions`, with `getExpansionAtDistance`
already containing an inlined and slightly modified version
of `lookupExpansion` (retaining the `LOOKUP_PRE_NEST` and
`LOOKUP_POST_NEST` macros, but with both of them doing nothing).

Not using an X macro here makes the actual control flow in both
places more obvious, and I think the repeated code is acceptable
for the same reasons as the similar-but-distinct implementations
of `readString`, `appendStringLiteral`, `yylex_NORMAL`, and
`yylex_RAW`.
2021-04-13 17:58:46 +02:00
Rangi
a2f52867ad Rename print to printChar
This clarifies its usage, for printing a single character
in error messages.
2021-04-13 17:41:12 +02:00