This shortens the lexer by 100 lines and simplifies
access to expansion contents, since it usually needs the
deepest one, not the top-level one.
Fixes#813
lexerState->expansionOfs is always either set to 0, or updated by
adding a positive quantity:
if (distance > lexerState->expansions->distance) {
lexerState->expansionOfs += distance - lexerState->expansions->distance;
...
}
so it will always be positive or zero.
Only two sites were for distances greater than 1:
a `shiftChars(2)`, trivial to just do two `shiftChar()`s;
and `shiftChars(size)` in `reportGarbageChar`, which
can be a `for` loop, and should be fixed anyway to
"avoid having to peek further than 0".
Instead of defining `LOOKUP_PRE_NEST` and `LOOKUP_POST_NEST`,
pass a variable name and a block to `lookupExpansion`; it
will assign successive looked-up expansions to the variable
and use them in the block.
The technique of using `__VA_ARGS__` to allow commas within a
block passed to a macro is not original, and should be stable.
This macro was only used twice, in `beginExpansion` and
`lexer_DumpStringExpansions`, with `getExpansionAtDistance`
already containing an inlined and slightly modified version
of `lookupExpansion` (retaining the `LOOKUP_PRE_NEST` and
`LOOKUP_POST_NEST` macros, but with both of them doing nothing).
Not using an X macro here makes the actual control flow in both
places more obvious, and I think the repeated code is acceptable
for the same reasons as the similar-but-distinct implementations
of `readString`, `appendStringLiteral`, `yylex_NORMAL`, and
`yylex_RAW`.
This decoding required high lookahead, and was not even
consistently useful (the `garbage_char` test case was not
valid UTF-8 and so did not benefit from `reportGarbageChar`).
This limits UTF-8 handling to the `STRLEN` and `STRSUB`
built-in functions, and to charmap conversion.
Fixes the test case from #800
The `out.gb` output was corrected, since the two "test"
fragments have a different order in ROM than in SRAM.
It is effectively:
; ROM0[$0000], fragments ordered by size
jr Label
dw Label
db 0
; SRAM[$a000], fragments ordered by .o order
ds 1 ; db 0
ds 2 ; jr Label
Label: ; $a003
ds 2 ; dw Label
Macro args were already handled by `peek`, and character escapes
do not exist outside of string literals.
This only affects the error message printed when a non-whitespace
character comes after the backslash. Instead of "Illegal character
escape '%s'", it will print "Begun line continuation, but
encountered character '%s'".
It still gets written to the object file, generating Valgrind warnings
about using uninitialized memory. To silence those errors, and make
output more reproducible, init the line no to a dummy (0) value.