If the type char is signed, then in the function
yylex_GetFloatMaskAndFloatLen(), *s can have a negative value and be converted
to a negative int32_t which is then used as an array index. It should be
converted to uint8_t instead to ensure that the value is in the bounds of the
tFloatingFirstChar, tFloatingSecondChar, and tFloatingChars arrays.
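A minimal sketch of the idea, not the actual lexer code: `table` and `lookup_mask` are hypothetical names standing in for the tFloating* arrays and the lookup done in yylex_GetFloatMaskAndFloatLen(). It assumes a 256-entry table indexed by character value.

```c
#include <stdint.h>

/* Hypothetical 256-entry lookup table, standing in for tFloatingChars. */
static uint32_t table[256];

/* Look up the entry for the character at *s without leaving the array. */
uint32_t lookup_mask(const char *s)
{
	/*
	 * If char is signed, *s may be negative (0xE9 is read as -23).
	 * Casting to uint8_t first maps every value into 0..255.
	 */
	return table[(uint8_t)*s];
}
```

Indexing with `(int32_t)*s` instead would read before the start of the table for any byte above 0x7F on platforms where char is signed.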
When a macro arg appears in a symbol name, the contents are appended.
However, the contents of the macro arg were not being validated.
Any character, regardless of whether it was allowed in a symbol name,
would be appended. With this change, the contents of the macro arg
are now validated character by character. The symbol name is considered
to end at the last valid character. The remainder of the macro arg is
treated as though it followed the symbol name in the asm source code.
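The character-by-character validation could look roughly like the sketch below. The function names are hypothetical, and the set of valid characters (letters, digits and '_') is an assumption made for illustration; the assembler's real set may differ.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed character set: letters, digits and '_'. The real set may differ. */
static int is_symbol_char(char c)
{
	return isalnum((unsigned char)c) || c == '_';
}

/*
 * Append the leading valid characters of arg to sym, a zero-terminated
 * buffer of capacity symsize. Returns how many characters of arg were
 * used; the caller treats arg + result as ordinary source text.
 */
size_t append_macro_arg(char *sym, size_t symsize, const char *arg)
{
	size_t len = strlen(sym);
	size_t used = 0;

	while (is_symbol_char(arg[used]) && len + 1 < symsize)
		sym[len++] = arg[used++];
	sym[len] = '\0';
	return used;
}
```

With a symbol prefix of `Label` and a macro arg of `_1+2`, this appends `_1` and leaves `+2` to be lexed as if it followed the symbol name.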
Standalone bracketed symbols like the following weren't being zero-terminated.
    X EQUS {Y}
This doesn't apply to bracketed symbols that aren't standalone, but are
instead found in a string. For example, the following works even without this
fix.
    X EQUS "{Y}"
Fix a few warnings that had to be resolved to build the source with this option.
Add new exception to .checkpatch.conf.
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
For example:
    PrintMacro : MACRO
        PRINTT \1
    ENDM
    PrintMacro STRCAT(\"Hello\"\, \
                      \" world\\n\")
It is possible to have spaces after the '\' and before the newline
character. This is needed because Windows line endings "\r\n" are
converted to " \n" before the lexer has a chance to handle them.
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
Lines can be continued past a newline character ('\n') by ending them with '\':
    DB 1, 2, 3, 4 \
       5, 6, 7, 8
This doesn't work for now in lists of arguments of macros.
It is possible to have spaces after the '\' and before the newline
character. This is needed because Windows line endings "\r\n" are
converted to " \n" before the lexer has a chance to handle them.
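A rough sketch of the check described above, under the assumption that the lexer has just consumed a '\'. `skip_line_continuation` is a hypothetical helper, not a function from the source tree.

```c
#include <stddef.h>

/*
 * p points just past a '\'. If only spaces follow up to a newline, return
 * a pointer to the first character of the next line; otherwise return NULL
 * so the caller can handle the '\' some other way.
 */
const char *skip_line_continuation(const char *p)
{
	while (*p == ' ')
		p++;
	return (*p == '\n') ? p + 1 : NULL;
}
```

Because "\r\n" has already been rewritten to " \n" at this point, accepting spaces before the newline is what lets files with Windows line endings use continuations.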
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
Newlines have to be handled before comments or comments won't be able to
handle line endings that don't include at least one LF character.
Also, document an obscure comment syntax: Anything that follows a '*'
placed at the start of a line is also a comment until the end of the
line.
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
With permission from the main authors [1], most of the code has been
relicensed under the MIT license.
SPDX license identifiers are used so that the license headers in source
code files aren't too large.
Add CONTRIBUTORS.rst file.
[1] https://github.com/rednex/rgbds/issues/128
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
Not all occurrences have been replaced; in some cases they have been
left as they were before (for example in rgbgfx, and where they are part
of the interface of a C standard library function).
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
There are two ways in which the assembly process can fail:
1. If there is a really big problem that compromises the whole process,
the assembler has to stop right there and generate an error message.
This happens with unterminated REPT loops, macros, etc.
2. If the problem isn't that big and the process can still continue,
even though the final result is invalid, the assembler can try to
continue and warn the user about all errors it finds in the code.
This patch clarifies the use of each function and replaces the function
used in two places by the correct one.
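The distinction between the two kinds of failure can be sketched as follows. The function and variable names here are hypothetical and chosen for illustration; they are not the names used in the assembler.

```c
#include <stdio.h>
#include <stdlib.h>

/* Number of recoverable errors seen so far (hypothetical counter). */
static unsigned int error_count;

/* Recoverable problem: report it and keep assembling to find more errors. */
void report_error(const char *msg)
{
	fprintf(stderr, "ERROR: %s\n", msg);
	error_count++;
}

/* Unrecoverable problem (e.g. unterminated REPT): report it and stop. */
void report_fatal(const char *msg)
{
	fprintf(stderr, "FATAL: %s\n", msg);
	exit(EXIT_FAILURE);
}

/* At the end of the run, any recorded error makes the result invalid. */
int exit_status(void)
{
	return error_count ? EXIT_FAILURE : EXIT_SUCCESS;
}
```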
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
Replace spaces by tabs for consistency. The rest of the codebase uses
tabs, so the linkerscript parser has to change.
Remove trailing tabs throughout the codebase.
Signed-off-by: AntonioND <antonio_nd@outlook.com>
The bug showed up when a semicolon was located anywhere after \".
These three test cases are syntactically correct but didn't compile:
1)
SECTION "HOME", HOME
db "\";"
2)
SECTION "HOME", HOME
db "\""
nop
;
3)
SECTION "HOME", HOME
db "\"" ;
The problem was located in yy_create_buffer(). Basically, this function loads an
entire source file, normalizes EOL terminators and filters out comments without
touching literal strings.
However, bounds of literal strings were wrongly guessed because \" was
interpreted as two characters (and so the double quote was not escaped).
In test 1, the string terminates early, so ;" is filtered out as if it were a
comment and the assembler complains about an unterminated string.
In tests 2 and 3, the string is in fact interpreted as two strings. The second
one only terminates at EOF, so comments are not filtered out, which makes the
assembler complain.
A special case must be taken into account:
4)
SECTION "HOME", HOME
db "\\" ;
So we need to ignore \\ as well.
Note that there is still a problem left: in yy_create_buffer() a string may
span multiple lines but not in the lexer. However in this case I think the lexer
would quit at the first newline so there should be nothing to worry about.
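The escape handling can be illustrated with the sketch below. It is not the code of yy_create_buffer(); `blank_comments` is a hypothetical function, and overwriting comments with spaces is an assumption about how the filtering is done. Like the behaviour noted above, it lets a string span multiple lines.

```c
#include <stddef.h>

/* Overwrite ';' comments with spaces, leaving string literals untouched. */
void blank_comments(char *buf)
{
	char *p = buf;

	while (*p) {
		if (*p == '"') {
			p++;
			/* Scan to the closing quote, skipping escaped characters. */
			while (*p && *p != '"') {
				if (*p == '\\' && p[1] != '\0')
					p++;	/* skip the escaped character: \" or \\ here */
				p++;
			}
			if (*p == '"')
				p++;
		} else if (*p == ';') {
			while (*p && *p != '\n')
				*p++ = ' ';
		} else {
			p++;
		}
	}
}
```

Treating a backslash and the character after it as one unit covers both \" (the quote does not close the string) and \\ (the second backslash does not escape the closing quote).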
A reference to an invalid macro argument (\ not followed by a digit
between 1 and 9) will cause an access outside of the bounds of the
currentmacroargs array in sym_FindMacroArg().
Macro arg references are processed in two places:
In CopyMacroArg(): called when scanning tokens between "", {} and
arguments of a macro call. The only problem here is that it accepts \0
as valid and so calls sym_FindMacroArg() with an invalid value.
In PutMacroArg(): called by the lexer automata when it encounters a
token matching \\[0-9]? (in cases other than the above). So it accepts not
only \0 but also \ alone.
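A minimal sketch of the bounds check both call sites need, assuming nine argument slots for \1 to \9. `macro_args` and `find_macro_arg` are hypothetical stand-ins for the currentmacroargs array and sym_FindMacroArg().

```c
#include <stddef.h>

/* Hypothetical stand-in for the currentmacroargs array: slots for \1..\9. */
static const char *macro_args[9];

/* Return the argument for the character following '\', or NULL if invalid. */
const char *find_macro_arg(char digit)
{
	if (digit < '1' || digit > '9')
		return NULL;	/* rejects \0, \A and a lone '\' */
	return macro_args[digit - '1'];
}
```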
Memo: In setuplex(), a rule is defined with a regex composed of up to
three ranges of chars and takes the form:
[FirstRange]
or [FirstRange][SecondRange]?
or [FirstRange]([SecondRange][Range]*)?
On scanning, when several rules match, the first longest one is
chosen.
Regression test:
1)
SECTION "HOME", HOME
db "\0"
2)
SECTION "HOME", HOME
db \A
3)
SECTION "HOME", HOME
db \
- separated the lexer into multiple functions so it is more readable
- fixed issue with long label names in macro arguments
- added error checking code to prevent buffer overflows
On Linux, valgrind complains about the overflow like this:
Pass 1...
==20054== Invalid read of size 1
==20054== at 0x406CDA: yylex (lexer.c:396)
==20054== by 0x40207C: yyparse (asmy.c:2921)
==20054== by 0x4086AF: main (main.c:351)
==20054== Address 0x503a102 is 0 bytes after a block of size 23,538 alloc'd
==20054== at 0x402994D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20054== by 0x406411: yy_create_buffer (lexer.c:147)
==20054== by 0x404FE3: fstk_RunInclude (fstack.c:243)
==20054== by 0x4025F5: yyparse (asmy.y:744)
==20054== by 0x4086AF: main (main.c:351)
==20054==
This is a bit of a crude fix which simply exits the hashing loop when
we reach the end of the string. We should probably do some kind of
length calculation on the buffer instead.
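The shape of the fix can be sketched like this. The hash function itself is made up for illustration and is not the lexer's; only the extra end-of-string test in the loop condition reflects the change described.

```c
#include <stddef.h>
#include <stdint.h>

/* Hash up to maxlen characters of s, stopping early at the terminating NUL. */
uint32_t hash_prefix(const char *s, size_t maxlen)
{
	uint32_t hash = 0;

	/* The s[i] != '\0' test keeps the loop from reading past the buffer. */
	for (size_t i = 0; i < maxlen && s[i] != '\0'; i++)
		hash = hash * 31 + (uint8_t)s[i];
	return hash;
}
```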
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Merging lai's source with this one is very irritating because
they have different indentation styles. I couldn't find what profile
vegard used for his version, so I used these flags (which should bring
the source close to KNF):
-bap
-br
-ce
-ci4
-cli0
-d0
-di0
-i8
-ip
-l79
-nbc
-ncdb
-ndj
-ei
-nfc1
-nlp
-npcs
-psl
-sc
-sob