Commit Graph

216 Commits

Author SHA1 Message Date
Eldred Habert
22a6a82642 Merge pull request #419 from dbrotz/fix-blackslash-tab-at-eof
Handle tabs after backslash at end of file
2019-09-23 00:05:21 +02:00
dbrotz
f36a3d5b2a Fix macro and rept buffer overflows
Macro and rept buffers were not always being terminated with newlines
and/or were vulnerable to the final newline being escaped, allowing
buffer overflows to occur. Now, they are terminated with newlines using
the same mechanism as the file buffer.
2019-09-10 03:03:04 -07:00
dbrotz
c5e8e4ff83 Reject input that contains null characters
Null characters in the middle of strings interact badly with the RGBDS
codebase, which assumes null-terminated strings. There is no reason to
support null characters in input source code, so the simplest way to deal
with null characters is to reject them early.
2019-09-09 17:27:56 -07:00
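Illustration for c5e8e4ff83 (not taken from the RGBDS source): a minimal sketch of an early check for null bytes in a loaded buffer. The function name `contains_null_byte` is hypothetical, and the sketch assumes the caller already knows the buffer length, since `strlen()` would stop at the first null byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if any of the first len bytes of buf
 * is a null byte. The length must come from the file read, not strlen().
 */
bool contains_null_byte(const char *buf, size_t len)
{
	assert(buf != NULL);
	return memchr(buf, '\0', len) != NULL;
}
```

A caller could run this once after reading the file and report an error before any lexing starts.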
dbrotz
89eda89838 Handle tabs after backslash at end of file
Commit 6fbb25c added support for tabs between a \ and the newline it escapes,
but yy_create_buffer() was not updated to handle tabs.
2019-09-09 12:25:26 -07:00
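Illustration for 89eda89838 (not taken from the RGBDS source): a minimal sketch of recognizing a line continuation when spaces or tabs sit between the backslash and the newline. The name `is_line_continuation` is hypothetical, and the sketch assumes a null-terminated buffer; what should happen when the buffer ends before a newline is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: given a pointer to the character right after a
 * backslash, return true if only spaces and tabs separate it from a newline.
 */
bool is_line_continuation(const char *after_backslash)
{
	const char *p = after_backslash;

	assert(p != NULL);
	while (*p == ' ' || *p == '\t')
		p++;
	return *p == '\n';
}
```

Using one predicate like this in every place that scans the buffer is one way to keep the two code paths from disagreeing about tabs.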
ISSOtm
6fbb25c0da Clean up lexer.c
Remove some hardcoded character values
Allow tabs to be used for line continuations
2019-09-05 15:22:24 +02:00
ISSOtm
476ccc9f6b Fix undefined behavior in yyunputstr
Refer to comment at lexer.c:100 for more info
2019-09-02 02:09:59 +02:00
ISSOtm
e0e8170fe6 Add recursion limit for string expansions
Unlike macros, REPTs and INCLUDEs, this recursion depth is independent.
This is intentional, because string expansions work very differently.

While it's easy to know when a string expansion begins, checking where it
ends is much more complicated, since the expansion's contents are simply
injected back into the lex buffer. Therefore, the depth has to be checked
after lexing took place.
Because of this, the placement of the expansion end check is somewhat
haphazard, but I think it's good. While I have no certainty, all tests
ended with all expansions properly ended, and I couldn't find any pitfalls.

Finally, `pCurrentStringExpansion` has been made global so error printing
can use it to tell the user if an error occurred inside of an expansion.
2019-08-31 15:50:08 +02:00
ISSOtm
dc2c97fe0c Comment and improve ParseSymbol and AppendMacroArg
2019-08-31 02:31:46 +02:00
ISSOtm
64752da42d Add "print types" to bracketed symbols
Should partially cover #178 and close #270.
This allows printing numbers in different bases and without the dollar prefix.
This is especially useful in macros because the dollar isn't a valid character
for symbol names, requiring heavy `STRSUB` usage.
2019-08-29 14:04:58 +02:00
Antonio Niño Díaz
88b66f2941 Merge pull request #364 from dbrotz/fix-362
Don't append invalid characters to symbol name
2019-08-17 16:08:27 +01:00
Antonio Niño Díaz
4040555532 Merge pull request #365 from dbrotz/terminate-bracketed-symbol
Terminate standalone bracketed symbol strings
2019-07-07 11:46:43 +01:00
Antonio Niño Díaz
dfdb107105 Merge pull request #370 from jidoc01/fix_bug
Fix comment bug
2019-07-06 11:10:17 +01:00
jidoc01
38110a833d Fix comment bug
There is a bug in processing comments in source files, related
to #326. It appears when a comment introduced with the character
';' contains a quotation mark without its matching pair.

The latest version of the rgbds assembler has a step that parses the given
source to convert its line endings to a unified one, and it
processes quotation marks even before it processes the comments.

I edited the source slightly, and it works fine now.
2019-07-05 13:48:24 +09:00
dbrotz
484d15dbb2 Handle unprintable characters more gracefully
* Skip UTF-8 byte order mark at beginning of file
* Error on other unexpected unprintable characters
2019-07-04 17:14:55 -07:00
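Illustration for 484d15dbb2 (not taken from the RGBDS source): a minimal sketch of detecting a UTF-8 byte order mark, which is the byte sequence EF BB BF, at the start of a buffer. The name `utf8_bom_length` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return how many leading bytes to skip, which is 3
 * if buf starts with the UTF-8 byte order mark and 0 otherwise.
 */
size_t utf8_bom_length(const char *buf, size_t len)
{
	static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

	assert(buf != NULL);
	if (len >= sizeof(bom) && memcmp(buf, bom, sizeof(bom)) == 0)
		return sizeof(bom);
	return 0;
}
```

The caller would advance its read position by the returned amount before handing the buffer to the lexer.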
dbrotz
1decf5d0d4 Fix out of bounds array access in lexer
If the type char is signed, then in the function
yylex_GetFloatMaskAndFloatLen(), *s can have a negative value and be converted
to a negative int32_t which is then used as an array index. It should be
converted to uint8_t instead to ensure that the value is in the bounds of the
tFloatingFirstChar, tFloatingSecondChar, and tFloatingChars arrays.
2019-07-04 17:01:29 -07:00
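Illustration for 1decf5d0d4 (not taken from the RGBDS source): a minimal sketch of why the cast matters when a byte is used as a table index. The table `char_flags` and the function `lookup_flags` are hypothetical stand-ins for the arrays named in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 256-entry table, one slot per possible byte value. */
uint32_t char_flags[256];

/*
 * If plain char is signed, a byte such as 0xE9 reads as a negative value,
 * so char_flags[(int32_t)*s] would index before the start of the array.
 * Converting to uint8_t first keeps the index in the range 0..255.
 */
uint32_t lookup_flags(const char *s)
{
	assert(s != NULL);
	return char_flags[(uint8_t)*s];
}
```

Whether plain char is signed is implementation-defined in C, so the unsafe form can appear to work on some platforms and fail on others.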
dbrotz
c75a9539ba Don't append invalid characters to symbol name
When a macro arg appears in a symbol name, the contents are appended.
However, the contents of the macro arg were not being validated.
Any character, regardless of whether it was allowed in a symbol name,
would be appended. With this change, the contents of the macro arg
are now validated character by character. The symbol name is considered
to end at the last valid character. The remainder of the macro arg is
treated as though it followed the symbol name in the asm source code.
2019-07-04 16:34:47 -07:00
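Illustration for c75a9539ba (not taken from the RGBDS source): a minimal sketch of validating a macro argument character by character and finding where the symbol name ends. The names are hypothetical, and the set of valid characters used here (letters, digits and underscore) is an assumption for the example, not the actual RGBDS rule.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumed rule for this sketch only: letters, digits and '_' are valid. */
static int is_symbol_char(unsigned char c)
{
	return isalnum(c) || c == '_';
}

/*
 * Hypothetical helper: return the length of the leading part of arg that
 * may be appended to a symbol name. Everything from that offset onward
 * would be treated as ordinary source text following the symbol.
 */
size_t valid_prefix_length(const char *arg)
{
	size_t n = 0;

	assert(arg != NULL);
	while (arg[n] != '\0' && is_symbol_char((unsigned char)arg[n]))
		n++;
	return n;
}
```

For an argument such as `abc+1`, the first three characters would join the symbol name and `+1` would be lexed afterwards.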
dbrotz
b3120aea25 Terminate standalone bracketed symbol strings
Standalone bracketed symbols like the following weren't being zero-terminated.

X EQUS {Y}

This doesn't apply to bracketed symbols that aren't standalone, but are
instead found in a string. For example, the following works even without this
fix.

X EQUS "{Y}"
2019-07-04 16:01:57 -07:00
Jakub Kądziołka
df15c97b6e Handle zero-byte files gracefully
2019-07-03 16:38:35 +02:00
Jakub Kądziołka
0d97b58265 Avoid potentially implementation-defined behavior when using a pipe as input
2019-07-03 16:38:00 +02:00
Jakub Kądziołka
8d5a53f529 Handle non-seekable input correctly
2019-07-03 15:38:14 +02:00
dbrotz
40006c6152 Make yylex() return int
2019-05-02 19:53:45 -07:00
Antonio Niño Díaz
4b40d63dfd Merge pull request #311 from dbrotz/fix-222
Fix #222 and #255
2018-12-10 23:17:39 +00:00
dbrotz
6c1ec59a5b Use separate function to append newlines
2018-12-05 01:32:06 -08:00
dbrotz
a060f135b8 Only add newlines to file if necessary
2018-12-02 20:43:20 -08:00
dbrotz
3806eb3139 Fix ambiguity in const parsing
2018-12-02 13:49:12 -08:00
dbrotz
bad66e54fa Fix buffer overflow when file ends with \
2018-12-01 07:21:25 -08:00
Antonio Niño Díaz
340362d984 Enable a few warning flags
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-04-02 22:53:48 +01:00
Antonio Niño Díaz
85ece88268 Add default clauses to switch statements
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-04-02 22:53:43 +01:00
Antonio Niño Díaz
b28a16c0da Enable -Wpedantic
Fix a few warnings so that the source builds with this option.

Add new exception to .checkpatch.conf.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-04-01 00:56:00 +01:00
Antonio Niño Díaz
0c85240b97 Allow line continuations in list of macro args
For example:

    PrintMacro : MACRO
        PRINTT \1
    ENDM

        PrintMacro STRCAT(\"Hello\"\,  \
                          \" world\\n\")

It is possible to have spaces after the '\' and before the newline
character. This is needed because Windows line endings "\r\n" are
converted to " \n" before the lexer has a chance to handle them.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-02-26 21:47:52 +00:00
Antonio Niño Díaz
58ab88da82 Allow escaping " in lists of macro args
For example:

    PrintMacro : MACRO
        PRINTT \1
    ENDM

        PrintMacro STRCAT(\"Hello\"\,  \" world\\n\")

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-02-26 21:47:52 +00:00
Antonio Niño Díaz
3e219dee36 Allow line continuations except inside macros
Lines can be continued by placing a '\' before the newline character ('\n'):

    DB 1, 2, 3, 4 \
       5, 6, 7, 8

This doesn't work for now in lists of arguments of macros.

It is possible to have spaces after the '\' and before the newline
character. This is needed because Windows line endings "\r\n" are
converted to " \n" before the lexer has a chance to handle them.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-02-26 21:47:52 +00:00
Antonio Niño Díaz
3bebedf1f8 Handle newlines and comments correctly
Newlines have to be handled before comments or comments won't be able to
handle line endings that don't include at least one LF character.

Also, document an obscure comment syntax: Anything that follows a '*'
placed at the start of a line is also a comment until the end of the
line.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-02-23 19:24:18 +00:00
Antonio Niño Díaz
1a5c423984 Relicense codebase under MIT license
With permission from the main authors [1], most of the code has been
relicensed under the MIT license.

SPDX license identifiers are used so that the license headers in source
code files aren't too large.

Add CONTRIBUTORS.rst file.

[1] https://github.com/rednex/rgbds/issues/128

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-01-26 22:59:02 +00:00
Antonio Niño Díaz
b04596a32b Move externs to header files
Follow Linux kernel coding style.

Remove exception from checkpatch.pl configuration file.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-01-04 01:28:23 +00:00
Antonio Niño Díaz
72f801283d Cleanup code of rgbasm
Follow Linux kernel coding style.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2018-01-02 17:09:36 +01:00
Antonio Niño Díaz
ec76431c51 Replace C types by stdint.h types
Not all occurrences have been replaced, in some cases they have been
left as they were before (like in rgbgfx and when they are in the
interface of a C standard library function).

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2017-12-31 15:46:22 +01:00
Antonio Niño Díaz
ba944527ec Replace ULONG by uint32_t
All affected `printf` have been fixed.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2017-12-31 15:16:08 +01:00
Antonio Niño Díaz
13c0684497 Replace 8 and 16 bit custom types by stdint.h types
Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2017-12-31 15:16:08 +01:00
Antonio Niño Díaz
ff2321a8ce Make fatalerror and yyerror consistent
There are two ways in which the assembly process can fail:

1. If there is a really big problem that compromises the whole process,
   the assembler has to stop right there and generate an error message.
   This happens with unterminated REPT loops, macros, etc.

2. If the problem isn't that big and the process can still continue,
   even though the final result is invalid, the assembler can try to
   continue and warn the user about all errors it finds in the code.

This patch clarifies the use of each function and replaces the function
used in two places by the correct one.

Signed-off-by: Antonio Niño Díaz <antonio_nd@outlook.com>
2017-04-29 15:02:57 +01:00
AntonioND
5f299bfe6c Fix whitespace
Replace spaces by tabs for consistency. The rest of the codebase uses
tabs, so the linkerscript parser has to change.

Remove trailing tabs across the whole codebase.

Signed-off-by: AntonioND <antonio_nd@outlook.com>
2017-04-04 22:14:46 +01:00
AntonioND
e50e3e5a23 Remove trailing whitespace
Signed-off-by: AntonioND <antonio_nd@outlook.com>
2017-04-02 17:46:14 +01:00
AntonioND
43fd1ee024 Fix some signed/unsigned comparison warnings
Signed-off-by: AntonioND <antonio_nd@outlook.com>
2017-04-02 17:08:12 +01:00
AntonioND
0867476bde Allow ',' to be escaped in string literals
It should only be needed for macro arguments, but it was added to the
string parsing function as well for consistency.
2017-03-19 00:05:50 +00:00
Christophe Staïesse
b8642bf3af Allow { and } to be escaped in string literals
This was stated in the documentation but was not actually implemented.
2017-03-18 16:23:41 +01:00
Anthony J. Bentley
6e0aca47d4 Declare string uppercase/lowercase functions unconditionally.
Avoid naming them str*(), because such names are reserved by ISO C.
2016-09-05 01:41:39 -06:00
stag019
ebc9a4b786 Merge include/link/types.h and include/asm/types.h into include/types.h
2015-03-07 16:04:07 -05:00
Anthony J. Bentley
1e1339467e Use POSIX 2001 as the base standard.
2014-11-06 21:39:36 -07:00
stag019
80e2129f22 Merge https://github.com/bentley/rgbds
Conflicts:
	include/lib/types.h
	src/asm/symbol.c
2014-11-02 01:00:20 -05:00
Christophe Staïesse
25efb00769 fix a bug in the lexer involving double quote escaping and semicolons
The bug showed up when a semicolon was located anywhere after \".

These three test cases are syntactically correct but didn't compile:

1)
SECTION "HOME", HOME
	db "\";"

2)
SECTION "HOME", HOME
	db "\""
	nop
	;

3)
SECTION "HOME", HOME
	db "\"" ;

The problem was located in yy_create_buffer(). Basically, this function loads an
entire source file, normalizes EOL terminators and filters out comments without
touching literal strings.

However, bounds of literal strings were wrongly guessed because \" was
interpreted as two characters (and so the double quote was not escaped).

In test 1, the string terminates early, so ;" is filtered out as if it were a
comment and the assembler complains of an unterminated string.
In tests 2 and 3, the string is in fact interpreted as two strings; the second
one terminates at EOF in these cases, so comments are not filtered out and
that makes the assembler complain.

A special case must be taken into account:

4)
SECTION "HOME", HOME
	db "\\" ;

So we need to ignore \\ as well.

Note that there is still a problem left: in yy_create_buffer() a string may
span multiple lines but not in the lexer. However in this case I think the lexer
would quit at the first newline so there should be nothing to worry about.
2014-10-10 16:50:11 +02:00
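Illustration for 25efb00769 (not taken from the RGBDS source): a minimal sketch of locating the start of a comment on one line while treating \" and \\ inside a string literal as escaped pairs. The name `find_comment_start` is hypothetical, and the sketch only handles ';' comments on a single null-terminated line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the offset of the ';' that starts a comment,
 * or the offset of the end of the line if there is no comment.
 */
size_t find_comment_start(const char *line)
{
	bool in_string = false;
	size_t i;

	assert(line != NULL);
	for (i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
		if (in_string && line[i] == '\\' &&
		    line[i + 1] != '\0' && line[i + 1] != '\n') {
			i++;	/* skip the escaped character, e.g. \" or \\ */
		} else if (line[i] == '"') {
			in_string = !in_string;
		} else if (line[i] == ';' && !in_string) {
			break;	/* first ';' outside a string starts the comment */
		}
	}
	return i;
}
```

With this approach, `db "\";"` has no comment, while in `db "\"" ;` and `db "\\" ;` the comment starts at the final ';', matching the behavior the commit message describes for tests 1, 3 and 4.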