1302 Commits

Author SHA1 Message Date
Rangi 3fdf01c0f5 Resolve some TODO comments
- `out_PushSection` should not set `currentSection` to NULL because
  PUSHS, PUSHC, and PUSHO consistently keep the current section,
  charmap, and options, even though the stack has been pushed.

- `Callback__FILE__` does not need to assert that `fileName` is not
  empty because `__FILE__`'s value is quoted, and can safely be empty.

- `YY_FATAL_ERROR` and `YYLMAX` are not needed since the lexer is
  not generated with flex.
2021-04-26 15:52:30 -04:00
Rangi 43cf20b155 Support Mac OS classic CR line endings in linkerscripts
This also refactors `readChar(file)` to `nextChar()` to be
more like the rgbasm lexer.
2021-04-26 12:05:36 -04:00
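
As a rough illustration of what accepting CR, LF, and CRLF line endings involves, the sketch below folds all three into a single '\n' while reading. The `Reader` struct, this `nextChar` body, and `countLineEndings` are hypothetical stand-ins that work on an in-memory buffer; they are not the actual rgblink code, which reads from a file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory reader, used here only for illustration */
struct Reader {
    const char *buf;
    size_t pos;
    size_t len;
};

/* Returns the next character, or -1 at end of input.
 * "\r\n" (Windows), "\r" (classic Mac OS), and "\n" (Unix) all yield '\n'. */
int nextChar(struct Reader *r)
{
    assert(r != NULL);
    if (r->pos >= r->len)
        return -1;

    int c = (unsigned char)r->buf[r->pos++];

    if (c == '\r') {
        /* Swallow the LF of a CRLF pair so it counts as one line ending */
        if (r->pos < r->len && r->buf[r->pos] == '\n')
            r->pos++;
        return '\n';
    }
    return c;
}

/* Counts line endings in a buffer using the normalizing reader */
size_t countLineEndings(const char *text, size_t len)
{
    struct Reader r = { text, 0, len };
    size_t n = 0;
    int c;

    while ((c = nextChar(&r)) != -1) {
        if (c == '\n')
            n++;
    }
    return n;
}
```

With this approach the rest of a parser only ever has to recognize '\n', regardless of which platform produced the file.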
Rangi e27a6d53a0 Support character escapes in linkerscript strings
This allows linkerscripts to refer to section names even if
they contain special characters: '\r' '\n' '\t' '"' '\\'.
2021-04-26 12:05:36 -04:00
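
A minimal sketch of decoding the five escapes listed above might look like the following. The function name `unescape`, its signature, and its choice to reject unknown escapes are assumptions made for illustration; the real linkerscript parser may report errors differently.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes the escapes \r \n \t \" \\ from `in` into `out` (capacity `cap`).
 * Returns the decoded length, or -1 on an unknown escape, a trailing
 * backslash, or insufficient space. Hypothetical helper, not rgblink code. */
long unescape(const char *in, char *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    if (cap == 0)
        return -1;

    size_t n = 0;

    for (; *in; in++) {
        char c = *in;

        if (c == '\\') {
            switch (*++in) {
            case 'r':  c = '\r'; break;
            case 'n':  c = '\n'; break;
            case 't':  c = '\t'; break;
            case '"':  c = '"';  break;
            case '\\': c = '\\'; break;
            default:   return -1; /* also catches a trailing backslash */
            }
        }
        if (n + 1 >= cap)
            return -1; /* keep room for the terminating NUL */
        out[n++] = c;
    }
    out[n] = '\0';
    return (long)n;
}
```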
ISSOtm dd8f396227 Fix compiler warnings
As reported in #789
2021-04-25 20:40:11 +02:00
ISSOtm b60853ea21 Fix RGBFIX option parsing on platforms with unsigned char
Such as Termux, once again.
2021-04-25 11:05:34 +02:00
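
The commit message does not say exactly what went wrong, so the following is only a sketch of one common way this class of bug arises, under the assumption that a negative sentinel (such as the -1 that `getopt`-style functions return) is stored in a `char`. Where `char` is unsigned, the sentinel becomes 255 and no longer compares equal to -1. All function names here are hypothetical.

```c
#include <assert.h>

/* Returns 1 if `sentinel` still compares equal after a round trip
 * through unsigned char (models platforms where plain char is unsigned) */
int survivesUnsignedChar(int sentinel)
{
    assert(sentinel >= -128 && sentinel <= 127);
    unsigned char c = (unsigned char)sentinel;

    return c == sentinel; /* c is promoted to int: 255 != -1 */
}

/* Same round trip through signed char (models platforms where plain
 * char is signed, which is where such code appears to work) */
int survivesSignedChar(int sentinel)
{
    assert(sentinel >= -128 && sentinel <= 127);
    signed char c = (signed char)sentinel;

    return c == sentinel;
}
```

Keeping such return values in an `int` avoids the problem on every platform.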
Rangi e050803ed1 Use size_t for measuring nested depths
Multiple functions involve tracking the current depth
of a nested structure (symbol expansions, interpolations,
REPT/FOR blocks, parentheses).
2021-04-23 14:28:10 +02:00
Rangi 27f38770d4 Parentheses in macro args prevent commas from starting new arguments
This is similar to C's behavior, and convenient for passing
function calls as single values, like `MUL(3.0, 4.0)` or
`STRSUB("str", 2, 1)`.

Fixes #704
2021-04-23 14:28:10 +02:00
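
The rule described above can be sketched as a depth counter: a comma only separates arguments when no parenthesis is open. `countMacroArgs` is a hypothetical helper written for this illustration; the real lexer also has to deal with string literals, escapes, and line continuations, which are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Counts comma-separated macro arguments, ignoring commas that are
 * nested inside parentheses. Hypothetical helper, not rgbasm code. */
size_t countMacroArgs(const char *args)
{
    assert(args != NULL);
    if (*args == '\0')
        return 0;

    size_t count = 1;
    size_t depth = 0; /* number of currently open parentheses */

    for (; *args; args++) {
        if (*args == '(') {
            depth++;
        } else if (*args == ')') {
            if (depth > 0)
                depth--;
        } else if (*args == ',' && depth == 0) {
            count++; /* only a top-level comma starts a new argument */
        }
    }
    return count;
}
```

Under this sketch, `MUL(3.0, 4.0), 5` counts as two arguments and `1, 2, 3` as three.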
ISSOtm db1f77f90b Correct "| operator" line not including the pipe
2021-04-23 14:24:53 +02:00
Rangi 4d21588eb2 Make invalid UTF-8 characters in strings non-fatal
STRLEN and STRSUB report the erroneous bytes

Fixes #848
2021-04-22 09:59:02 +02:00
Rangi e596dbfc80 Make failed macro arg expansions non-fatal
Expanding empty strings is valid but pointless;
macro args already skipped doing so, now other
`beginExpansion` calls do too.

This also fixes failed interpolations (which were
already non-fatal) to continue reading characters,
not evaluate to their initial '{' character.
2021-04-22 09:59:02 +02:00
Rangi 267e4bc25c rgbds.7(7) shows an example of piping rgbasm to rgblink to rgbfix
This uses one line instead of three
2021-04-20 22:06:02 -04:00
Rangi c3e27217dd More specific "Symbol name too long" error messages
Identifiers, {interpolations} and \<macroArgs> are distinct
2021-04-20 17:14:21 +02:00
Rangi fe3521c7a4 Switch from parentheses to angle brackets
`\(` is more likely to be a valid escape sequence in the
future (as is `\[`) and `\{` is already taken.
2021-04-20 17:14:21 +02:00
Rangi 7a314e7aff Support numeric symbol names in \(parentheses)
For example, \(_NARG) will get the last argument
2021-04-20 17:14:21 +02:00
Rangi 637bbbdf43 Support multi-digit macro arguments in parentheses
This allows access to arguments past \9 without using 'shift'
2021-04-20 17:14:21 +02:00
Rangi 8230e8165c Eliminate isAtEOF by changing yylex control flow
`yylex` calls `yywrap` at the beginning of the next call, after it
has set `lexerState->lastToken` to `T_EOB`.
2021-04-20 17:10:08 +02:00
Rangi a727a0f81f Capture termination status is equivalent to not having reached EOF
This avoids the need for a separate `terminated` flag
2021-04-20 17:10:08 +02:00
Rangi 7a587eb7d6 Use midrule action values for captures' terminated status
Bison 3.1 introduces "typed midrule values", which would write
`<captureTerminated>{ ... }` and `$$` instead of `{ ... }` and
`$<captureTerminated>[1-9]`, but rgbds supports 3.0 or even lower.
2021-04-20 17:10:08 +02:00
Rangi 7ac8bd6e24 Return a marker token at the end of any buffer
Removes the lexer hack mentioned in #778
2021-04-20 17:10:08 +02:00
Rangi be2572edca Track nested interpolation depth even outside string literals
Fixes #837
2021-04-20 09:37:29 -04:00
Rangi cf2bbe6435 Position -1 is the last character of a string
Position 0 is invalid, which matches with STRIN/STRRIN
returning 0 on failure.
2021-04-20 14:27:59 +02:00
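
One way to picture the position rule in this commit is as a mapping from 1-based positions to 0-based indexes, where negative positions count back from the end and 0 never maps to anything. `positionToIndex` is a hypothetical helper; rgbasm's own handling may differ, for example by warning and clamping instead of rejecting an out-of-range position.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Maps a 1-based position to a 0-based index into a string of `len`
 * characters. Negative positions count from the end (-1 is the last
 * character). Returns -1 for position 0 or anything out of range. */
long positionToIndex(long pos, size_t len)
{
    assert(len < (size_t)LONG_MAX);
    long n = (long)len;

    if (pos < 0)
        pos += n + 1; /* -1 becomes n, -n becomes 1 */
    if (pos < 1 || pos > n)
        return -1;
    return pos - 1;
}
```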
Rangi dc5b7802c8 Make the len parameter optional in STRSUB(str, pos, len)
An unspecified length will continue to the end of the string.
2021-04-20 14:27:59 +02:00
Rangi b1e6c73197 STRSUB and CHARSUB allow zero or negative positions
These are offsets from the end of the string, as if the
STRLEN or CHARLEN respectively were added to the position.

Fixes #812
2021-04-20 14:27:59 +02:00
Rangi 459773b3f0 Update some whitespace after Hungarian prefixes were removed
Keep the parameter alignment and 100-char line limit
2021-04-19 16:47:39 -04:00
ISSOtm 6d0a3c75e9 Get rid of Hungarian notation for good
Bye bye it was not nice knowing ya
2021-04-19 22:12:10 +02:00
Rangi 52797b6f68 Implement SIZEOF("Section") and STARTOF("Section") (#766)
Updates the object file revision to 8

Fixes #765
2021-04-17 18:36:26 -04:00
Rangi 5108c5643c Let charmap_ConvertNext advance its output pointer
2021-04-17 18:18:34 -04:00
Rangi 2005ed1df9 Implement CHARLEN and CHARSUB
Fixes #786
2021-04-17 18:18:34 -04:00
Rangi d43408f4f3 Allow OPT to modify -W
Warning flags are processed individually;
PUSHO and POPO (re)store all the warning states.
2021-04-18 00:11:18 +02:00
Rangi 2c30ab8731 Allow OPT to modify -L
-L is a Boolean flag option, so you specify 'OPT L' or 'OPT !L'.
2021-04-18 00:11:18 +02:00
Rangi 9923fa3eee Fix expansions that start from the end of another expansion (#839)
Do not free an expansion until its offset is *past* its size.
This means potentially freeing a nested stack of expansions
all at once.

Fixes #696
2021-04-17 13:14:40 -04:00
Rangi 750e93be3d Further simplify formatting code
- Remove redundant length checks before `memcpy`
- Coerce `sign` and `prefix` to boolean for `numLen`
2021-04-17 01:11:11 -04:00
Rangi ee5da4468d Fix interpolation/STRFMT overflow issues (#838)
Widths and fractional widths greater than 255 would overflow a
uint8_t and wrap around to smaller values.

Total formatted lengths greater than the available buffer size
would overflow it and potentially corrupt memory.

Fixes #830
Closes #831
2021-04-17 00:52:55 -04:00
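
The wrap-around described in this commit is easy to reproduce in isolation. In the sketch below, `narrowWidth` shows the failure mode and `parseWidth` shows one possible guard: accumulate in a `size_t` and reject anything above a caller-chosen limit. Both functions and the `max` parameter are hypothetical and are not taken from the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The failure mode: narrowing to uint8_t wraps modulo 256 */
uint8_t narrowWidth(unsigned int width)
{
    return (uint8_t)width;
}

/* One possible guard: parse decimal digits into a size_t and refuse
 * values above `max`. Returns 1 and stores the width on success,
 * 0 on empty input, a non-digit, or a value greater than `max`. */
int parseWidth(const char *digits, size_t max, size_t *out)
{
    assert(digits != NULL && out != NULL);
    if (*digits == '\0')
        return 0;

    size_t value = 0;

    for (; *digits; digits++) {
        if (*digits < '0' || *digits > '9')
            return 0;

        size_t d = (size_t)(*digits - '0');

        /* Check value * 10 + d <= max without overflowing */
        if (d > max || value > (max - d) / 10)
            return 0;
        value = value * 10 + d;
    }
    *out = value;
    return 1;
}
```

A width of 300 narrowed to `uint8_t` becomes 44, which is the kind of silently smaller value the commit message describes.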
Rangi 503c3b5364 Revert "Fix interpolation/STRFMT overflow issues"
This reverts commit 992be3fd9b.
2021-04-16 22:19:37 -04:00
Rangi 992be3fd9b Fix interpolation/STRFMT overflow issues
Widths and fractional widths greater than 255 would overflow a
uint8_t and wrap around to smaller values.

Total formatted lengths greater than the available buffer size
would overflow it and potentially corrupt memory.

Fixes #830
Closes #831
2021-04-16 22:00:17 -04:00
Rangi c755fa3469 readIdentifier does not process characters that get truncated
Previously a '.' could be past the truncation limit but still
cause the identifier to be marked as local, violating an
assertion in `sym_AddLocalLabel`.

Fixes #832
2021-04-16 21:15:01 -04:00
Rangi e78a1d5bfd readInterpolation is limited by nMaxRecursionDepth
Fixes #837
2021-04-16 16:10:46 -04:00
Rangi d2f6def2eb Remove unused function hash_ReplaceElement
2021-04-16 12:36:45 -04:00
Jakub Kądziołka 215e26b478 charmap: Store hashmap nodes in charmap stack
This helps update all the pointers during reallocation.
2021-04-16 16:00:26 +02:00
Jakub Kądziołka 8885f7bcf6 hash_AddElement: return the new node
2021-04-16 16:00:26 +02:00
Jakub Kądziołka 5334fc334e Don't report hashmap collisions
This doesn't seem to be very useful, and keeping this "feature" is
difficult.
2021-04-16 16:00:26 +02:00
Jakub Kądziołka f97663aa37 hashmap: add hash_GetNode
2021-04-16 16:00:26 +02:00
Rangi 5c852c7651 Store the nested expansions starting from the deepest one (#829)
This shortens the lexer by 100 lines and simplifies
access to expansion contents, since it usually needs the
deepest one, not the top-level one.

Fixes #813
2021-04-16 09:54:13 -04:00
Rangi 6be3584467 LexerState's 'size' and 'offset' for mmapped files are unsigned
These were using signed 'off_t' because that is the type of
'st_size' from 'stat()', but neither one can be negative.
2021-04-16 10:23:37 +02:00
Rangi 8c90d9d2d7 Get rid of skip in struct Expansion
This was only used to skip the two macro arg characters,
but shiftChar() can skip them before the expansion.
2021-04-16 10:23:37 +02:00
Rangi f69e666b00 expansionOfs cannot be negative
lexerState->expansionOfs is always either set to 0, or updated by
adding a positive quantity:

    if (distance > lexerState->expansions->distance) {
        lexerState->expansionOfs += distance - lexerState->expansions->distance;
        ...
    }

so it will always be positive or zero.
2021-04-16 10:23:37 +02:00
Rangi eba06404f0 peek(0) => peek()
This does not completely refactor `peek` as #708 suggested,
to make it shift and cache a character itself. However it
does simplify the lexer code.
2021-04-16 10:23:37 +02:00
Rangi 9558ccea1b shiftChars(1) => shiftChar()
Only two sites were for distances greater than 1:
a `shiftChars(2)`, trivial to just do two `shiftChar()`s;
and `shiftChars(size)` in `reportGarbageChar`, which
can be a `for` loop, and should be fixed anyway to
"avoid having to peek further than 0".
2021-04-16 10:23:37 +02:00
Rangi 260d372acd Lex $ff00+c without needing large peek lookahead
This also allows arbitrary amounts of whitespace in `$ff00 + c`,
instead of needing to fit in the 42-byte LEXER_BUF_SIZE
2021-04-16 10:23:37 +02:00
Rangi 8fa5a4255e Mention alternative mnemonics in gbz80(7)
Fixes #819
2021-04-14 16:59:31 -04:00