Improve linker scripts a little (#1275)

* Allow for optional sections in linker scripts
  These are more useful for frameworks/toolchains.

* Check for an active mem region everywhere
  Do you like segfaults? Too bad!

* Allow the address to be floating in linker scripts
  Try and make the life of SDCC interop easier.

* Also validate alignment when floating

* Overhaul the linker script manual page
  Documenting the new features, but also restructuring the
  existing documentation to make the manual page (hopefully)
  easier to understand.

Author: Eldred Habert
Date: 2023-12-25 05:29:11 +01:00
Committer: GitHub
parent 7b199d7550
commit ccf9dcb851
2 changed files with 248 additions and 115 deletions

@@ -7,47 +7,41 @@
.Nm rgblink
.Nd linker script file format
.Sh DESCRIPTION
The linker script is an external file that allows the user to specify the order of sections at link time and in a centralized manner.
The linker script is a file that allows specifying attributes for sections at link time, and in a centralized manner.
There can only be one linker script per invocation of
.Nm ,
but it can be split into several files
.Pq using the Ic INCLUDE No directive .
.Ss Basic syntax
The linker script syntax is line-based.
Each line may have a directive or section name, a comment, both, or neither.
Whitespace (space and tab characters) is used to separate syntax elements, but is otherwise ignored.
.Pp
A linker script consists of a series of bank declarations, each optionally followed by a list of section names (in double quotes) or directives.
All reserved keywords (bank types and directive names) are case-insensitive; all section names are case-sensitive.
.Pp
Any line can contain a comment starting with
Comments begin with a semicolon
.Ql \&;
that ends at the end of the line.
.Bd -literal -offset indent
; This line is a comment
ROMX $F ; start a bank
"Some functions" ; a section name
ALIGN 8 ; a directive
"Some array"
WRAMX 2 ; start another bank
org $d123 ; another directive
"Some variables"
.Ed
character, until the end of the line.
They are simply ignored.
.Pp
Numbers can be in decimal or hexadecimal format
.Pq the prefix is Ql $ .
It is an error if any section name or directive is found before setting a bank.
Keywords are composed of letters and digits (but they can't start with a digit); they are all case-insensitive.
.Pp
Files can be included by using the
.Ic INCLUDE
keyword, followed by a string with the path of the file that has to be included.
.Pp
The possible bank types are:
.Cm ROM0 , ROMX , VRAM , SRAM , WRAM0 , WRAMX , OAM
and
.Cm HRAM .
Unless there is a single bank, which can occur with types
.Cm ROMX , VRAM , SRAM
and
.Cm WRAMX ,
it is mandatory to specify a bank number after the type.
.Pp
Section names in double quotes support the same character escape sequences as strings in
Numbers can be written in decimal format, or in binary using the
.Ql %
prefix, or in hexadecimal using the
.Ql $
prefix (hexadecimal digits are case-insensitive).
Note that unlike
.Xr rgbasm 5 ,
specifically
an octal
.Ql &
prefix is not supported, nor are
.Ql _
digit separators.
.Pp
Strings begin with a double quote, and end at the next (non-escaped) double quote.
Strings must not contain literal newline characters.
Most of the same character escapes as
.Xr rgbasm 5
are supported, specifically
.Ql \e\e ,
.Ql \e" ,
.Ql \en ,
@@ -56,51 +50,113 @@ and
.Ql \et .
Other backslash escape sequences in
.Xr rgbasm 5
are only relevant to assembly code and do not apply in section names.
.Pp
When a new bank statement is found, sections found after it will be placed right from the beginning of that bank.
If the linker script switches to a different bank and then comes back to a previous one, it will continue from the last address that was used.
.Pp
The only three directives are
.Ic ORG ,
.Ic ALIGN ,
are only relevant to assembly code and do not apply in linker scripts.
.Ss Directives
.Bl -tag -width Ds
.It Including other files
.Ql Ic INCLUDE Ar path
acts as if the contents of the file at
.Ar path
were copy-pasted in place of the
.Ic INCLUDE
directive.
.Ar path
must be a string.
.It Specifying the active bank
.Tg region
The active bank can be set by specifying its type (memory region) and number.
The possible types are:
.Ic ROM0 , ROMX , VRAM , SRAM , WRAM0 , WRAMX , OAM ,
and
.Ic DS :
.Bl -bullet
.It
.Ic ORG Ar addr
sets the address in which new sections will be placed to
.Ic HRAM .
The bank number can be omitted from the types that only contain a single bank, which are:
.Ic ROM0 ,
.Ic ROMX No if Fl t No is passed to Xr rgblink 1 ,
.Ic VRAM No if Fl d No is passed to Xr rgblink 1 ,
.Ic WRAM0 ,
.Ic WRAMX No if Fl w No is passed to Xr rgblink 1 ,
.Ic OAM ,
and
.Ic HRAM .
.Pq Ic SRAM No is the only type that can never have its bank number omitted.
.Pp
After a bank specification, the
.Dq current address
is set to the last value it had for that bank.
If the bank has never been active thus far, the
.Dq current address
defaults to the beginning of the bank
.Pq e.g. Ad $4000 No for Ic ROMX No sections .
.It Changing the current address
A bank must be active for any of these directives to be used.
.Pp
.Ql Ic ORG Ar addr
sets the
.Dq current address
to
.Ar addr .
It can not be lower than the current address.
.It
.Ic ALIGN Ar addr
or
.Ic ALIGN Ar addr , Ar offset
will increase the address until it is aligned to the specified boundary
.Po it tries to set to
.Ar offset
the number of bits specified by
.Ar align :
for example,
.Ql ALIGN 8
will align to $100 ,
and
.Ql ALIGN 8 , 10
will align to $10A
.Pc .
.It
.Ic DS
will increase the address by the specified non-negative amount.
.El
This directive cannot be used to move the address backwards:
.Ar addr
must be greater than or equal to the
.Dq current address .
.Pp
.Sy Note :
The bank, alignment, address and type of sections can be specified both in the source code and in the linker script.
For a section to be able to be placed with the linker script, the bank, address and alignment must be left unassigned in the source code or be compatible with what is specified in the linker script.
For example,
.Ql ALIGN[8]
in the source code is compatible with
.Ql ORG $F00
in the linker script.
.Ql Ic FLOATING
causes all sections between it and the next
.Ic ORG
or bank specification to be placed at addresses automatically determined by
.Nm .
.Pp
.Ql Ic ALIGN Ar addr , Ar offset
increases the
.Dq current address
until it is aligned to the specified boundary (i.e. the
.Ar align
lowest bits of the address are equal to
.Ar offset ) .
If
.Ar offset
is omitted, it is implied to be 0.
For example, if the
.Dq current address
is $0007,
.Ql ALIGN 8
would set it to $0100, and
.Ql ALIGN 8 , 10
would set it to $000A.
.Pp
.Ql Ic DS Ar size
increases the
.Dq current address
by
.Ar size .
The gap is not allocated, so smaller floating sections can later be placed there.
.El
.Ss Section placement
A section can be placed simply by naming it (with a string).
Its bank is set to the active bank, and its address to the
.Dq current address .
Any constraints the section already possesses (whether from earlier in the linker script, or from the object files being linked) must be consistent with what the linker script specifies: the section's type must match, the section's bank number (if set) must match the active bank, etc.
In particular, if the section has an alignment constraint, the address at which it is placed by the linker script must obey that constraint; otherwise, an error will occur.
.Pp
After a section is placed, the
.Dq current address
is increased by the section's size.
This must not increase it past the end of the active memory region.
.Pp
The section must have been defined in the object files being linked, unless the section name is followed by the keyword
.Ic OPTIONAL .
.Sh EXAMPLES
.Bd -literal -offset indent
; This line contains only a comment
ROMX $F ; start a bank
"Some functions" ; a section name
ALIGN 8 ; a directive
"Some \e"array\e""
WRAMX 2 ; start another bank
org $d123 ; another directive
"Some variables"
.Ed
.Sh SEE ALSO
.Xr rgbasm 1 ,
.Xr rgbasm 5 ,
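
The ALIGN arithmetic described in the manual page above (current address $0007, `ALIGN 8` giving $0100, `ALIGN 8, 10` giving $000A) can be sketched as follows. This is only an illustration of the documented behaviour, not the code from this commit; the function name `alignUp` is hypothetical, and the sketch assumes the offset is already known to be smaller than the alignment size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of rgblink): returns the smallest address >= addr
// whose `alignment` lowest bits are equal to `offset`.
uint32_t alignUp(uint32_t addr, uint32_t alignment, uint32_t offset) {
    // Assumption: the caller has already validated both arguments.
    assert(alignment < 32);
    uint32_t mask = (1u << alignment) - 1;
    assert(offset <= mask);
    // Unsigned wrap-around makes (offset - addr) & mask the forward distance
    // to the next address whose low bits match `offset`.
    return addr + ((offset - addr) & mask);
}
```

An address that already satisfies the constraint is returned unchanged, which matches the manual's wording that the current address is only ever increased.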

@@ -37,9 +37,10 @@
static void setSectionType(SectionType type);
static void setSectionType(SectionType type, uint32_t bank);
static void setAddr(uint32_t addr);
static void makeAddrFloating(void);
static void alignTo(uint32_t alignment, uint32_t offset);
static void pad(uint32_t length);
static void placeSection(std::string const &name);
static void placeSection(std::string const &name, bool isOptional);
static yy::parser::symbol_type yylex(void);
@@ -53,21 +54,27 @@
%token newline
%token COMMA ","
%token ORG "ORG"
FLOATING "FLOATING"
INCLUDE "INCLUDE"
ALIGN "ALIGN"
DS "DS"
OPTIONAL "OPTIONAL"
%code {
static std::array keywords{
Keyword{"ORG"sv, yy::parser::make_ORG},
Keyword{"INCLUDE"sv, yy::parser::make_INCLUDE},
Keyword{"ALIGN"sv, yy::parser::make_ALIGN},
Keyword{"DS"sv, yy::parser::make_DS},
Keyword{"ORG"sv, yy::parser::make_ORG},
Keyword{"FLOATING"sv, yy::parser::make_FLOATING},
Keyword{"INCLUDE"sv, yy::parser::make_INCLUDE},
Keyword{"ALIGN"sv, yy::parser::make_ALIGN},
Keyword{"DS"sv, yy::parser::make_DS},
Keyword{"OPTIONAL"sv, yy::parser::make_OPTIONAL},
};
}
%token <std::string> string;
%token <uint32_t> number;
%token <SectionType> section_type;
%type <bool> optional;
%%
lines: %empty
@@ -82,11 +89,16 @@ line: INCLUDE string newline { includeFile(std::move($2)); } // Note: this addit
directive: section_type { setSectionType($1); }
| section_type number { setSectionType($1, $2); }
| FLOATING { makeAddrFloating(); }
| ORG number { setAddr($2); }
| ALIGN number { alignTo($2, 0); }
| ALIGN number COMMA number { alignTo($2, $4); }
| DS number { pad($2); }
| string { placeSection($1); }
| string optional { placeSection($1, $2); }
;
optional: %empty { $$ = false; }
| OPTIONAL { $$ = true; }
;
%%
@@ -318,10 +330,14 @@ try_again: // Can't use a `do {} while(0)` loop, otherwise compilers (wrongly) t
static std::array<std::vector<uint16_t>, SECTTYPE_INVALID> curAddr;
static SectionType activeType; // Index into curAddr
static uint32_t activeBankIdx; // Index into curAddr[activeType]
static bool isPcFloating;
static uint16_t floatingAlignMask;
static uint16_t floatingAlignOffset;
static void setActiveTypeAndIdx(SectionType type, uint32_t idx) {
activeType = type;
activeBankIdx = idx;
isPcFloating = false;
if (curAddr[activeType].size() <= activeBankIdx) {
curAddr[activeType].resize(activeBankIdx + 1, sectionTypeInfo[type].startAddr);
}
@@ -344,11 +360,11 @@ static void setSectionType(SectionType type, uint32_t bank) {
auto const &typeInfo = sectionTypeInfo[type];
if (bank < typeInfo.firstBank) {
scriptError(context, "%s bank %" PRIu32 " doesn't exist, the minimum is %" PRIu32,
scriptError(context, "%s bank %" PRIu32 " doesn't exist (the minimum is %" PRIu32 ")",
typeInfo.name.c_str(), bank, typeInfo.firstBank);
bank = typeInfo.firstBank;
} else if (bank > typeInfo.lastBank) {
scriptError(context, "%s bank %" PRIu32 " doesn't exist, the maximum is %" PRIu32,
scriptError(context, "%s bank %" PRIu32 " doesn't exist (the maximum is %" PRIu32 ")",
typeInfo.name.c_str(), bank, typeInfo.lastBank);
}
@@ -357,22 +373,64 @@ static void setSectionType(SectionType type, uint32_t bank) {
static void setAddr(uint32_t addr) {
auto const &context = lexerStack.back();
if (activeType == SECTTYPE_INVALID) {
scriptError(context, "Cannot set the current address: no memory region is active");
return;
}
auto &pc = curAddr[activeType][activeBankIdx];
auto const &typeInfo = sectionTypeInfo[activeType];
if (addr < pc) {
scriptError(context, "ORG cannot be used to go backwards (from $%04x to $%04x)", pc, addr);
scriptError(context, "Cannot decrease the current address (from $%04x to $%04x)", pc, addr);
} else if (addr > endaddr(activeType)) { // Allow "one past the end" sections.
scriptError(context, "Cannot go to $%04" PRIx32 ": %s ends at $%04" PRIx16 "",
scriptError(context, "Cannot set the current address to $%04" PRIx32 ": %s ends at $%04" PRIx16 "",
addr, typeInfo.name.c_str(), endaddr(activeType));
pc = endaddr(activeType);
} else {
pc = addr;
}
isPcFloating = false;
}
static void makeAddrFloating(void) {
auto const &context = lexerStack.back();
if (activeType == SECTTYPE_INVALID) {
scriptError(context, "Cannot make the current address floating: no memory region is active");
return;
}
isPcFloating = true;
floatingAlignMask = 0;
floatingAlignOffset = 0;
}
static void alignTo(uint32_t alignment, uint32_t alignOfs) {
auto const &context = lexerStack.back();
if (activeType == SECTTYPE_INVALID) {
scriptError(context, "Cannot align: no memory region is active");
return;
}
if (isPcFloating) {
if (alignment >= 16) {
setAddr(floatingAlignOffset);
} else {
uint32_t alignSize = 1u << alignment;
if (alignOfs >= alignSize) {
scriptError(context, "Cannot align: The alignment offset (%" PRIu32
") must be less than alignment size (%" PRIu32 ")\n",
alignOfs, alignSize);
return;
}
floatingAlignMask = alignSize - 1;
floatingAlignOffset = alignOfs % alignSize;
}
return;
}
auto const &typeInfo = sectionTypeInfo[activeType];
auto &pc = curAddr[activeType][activeBankIdx];
@@ -391,7 +449,7 @@ static void alignTo(uint32_t alignment, uint32_t alignOfs) {
if (alignOfs >= alignSize) {
scriptError(context, "Cannot align: The alignment offset (%" PRIu32
") must be less than alignment size (%" PRIu32 ")\n",
alignOfs, 1 << alignment);
alignOfs, alignSize);
return;
}
@@ -411,23 +469,30 @@ static void alignTo(uint32_t alignment, uint32_t alignOfs) {
static void pad(uint32_t length) {
auto const &context = lexerStack.back();
if (activeType == SECTTYPE_INVALID) {
scriptError(context, "Cannot increase the current address: no memory region is active");
return;
}
if (isPcFloating) {
floatingAlignOffset = (floatingAlignOffset + length) & floatingAlignMask;
return;
}
auto const &typeInfo = sectionTypeInfo[activeType];
auto &pc = curAddr[activeType][activeBankIdx];
assert(pc >= typeInfo.startAddr);
if (uint16_t offset = pc - typeInfo.startAddr; length + offset > typeInfo.size) {
scriptError(context, "Cannot pad by %u bytes: only %u bytes to $%04" PRIx16,
scriptError(context, "Cannot increase the current address by %u bytes: only %u bytes to $%04" PRIx16,
length, typeInfo.size - offset, (uint16_t)(endaddr(activeType) + 1));
} else {
pc += length;
}
}
static void placeSection(std::string const &name) {
static void placeSection(std::string const &name, bool isOptional) {
auto const &context = lexerStack.back();
auto const &typeInfo = sectionTypeInfo[activeType];
// A type *must* be active.
if (activeType == SECTTYPE_INVALID) {
scriptError(context, "No memory region has been specified to place section \"%s\" in",
name.c_str());
@@ -436,10 +501,13 @@ static void placeSection(std::string const &name) {
auto *section = sect_GetSection(name.c_str());
if (!section) {
scriptError(context, "Unknown section \"%s\"", name.c_str());
if (!isOptional) {
scriptError(context, "Unknown section \"%s\"", name.c_str());
}
return;
}
auto const &typeInfo = sectionTypeInfo[activeType];
assert(section->offset == 0);
// Check that the linker script doesn't contradict what the code says.
if (section->type == SECTTYPE_INVALID) {
@@ -460,28 +528,37 @@ static void placeSection(std::string const &name) {
section->isBankFixed = true;
section->bank = bank;
uint16_t &org = curAddr[activeType][activeBankIdx];
if (section->isAddressFixed && org != section->org) {
scriptError(context, "The linker script assigns section \"%s\" to address $%04" PRIx16 ", but it was already at $%04" PRIx16,
name.c_str(), org, section->org);
} else if (section->isAlignFixed && (org & section->alignMask) != section->alignOfs) {
uint8_t alignment = std::countr_one(section->alignMask);
scriptError(context, "The linker script assigns section \"%s\" to address $%04" PRIx16 ", but that would be ALIGN[%" PRIu8 ", %" PRIu16 "] instead of the requested ALIGN[%" PRIu8 ", %" PRIu16 "]",
name.c_str(), org, alignment, (uint16_t)(org & section->alignMask), alignment, section->alignOfs);
}
section->isAddressFixed = true;
section->isAlignFixed = false; // This can't be set when the above is.
section->org = org;
if (!isPcFloating) {
uint16_t &org = curAddr[activeType][activeBankIdx];
if (section->isAddressFixed && org != section->org) {
scriptError(context, "The linker script assigns section \"%s\" to address $%04" PRIx16 ", but it was already at $%04" PRIx16,
name.c_str(), org, section->org);
} else if (section->isAlignFixed && (org & section->alignMask) != section->alignOfs) {
uint8_t alignment = std::countr_one(section->alignMask);
scriptError(context, "The linker script assigns section \"%s\" to address $%04" PRIx16 ", but that would be ALIGN[%" PRIu8 ", %" PRIu16 "] instead of the requested ALIGN[%" PRIu8 ", %" PRIu16 "]",
name.c_str(), org, alignment, (uint16_t)(org & section->alignMask), alignment, section->alignOfs);
}
section->isAddressFixed = true;
section->isAlignFixed = false; // This can't be set when the above is.
section->org = org;
uint16_t curOfs = org - typeInfo.startAddr;
if (section->size > typeInfo.size - curOfs) {
scriptError(context, "The linker script assigns section \"%s\" to address $%04" PRIx16 ", but then it would overflow %s by %" PRIx16 " bytes",
name.c_str(), org, typeInfo.name.c_str(),
(uint16_t)(section->size - (typeInfo.size - curOfs)));
// Fill as much as possible without going out of bounds.
org = typeInfo.startAddr + typeInfo.size;
uint16_t curOfs = org - typeInfo.startAddr;
if (section->size > typeInfo.size - curOfs) {
scriptError(context, "The linker script assigns section \"%s\" to address $%04" PRIx16 ", but then it would overflow %s by %" PRIx16 " bytes",
name.c_str(), org, typeInfo.name.c_str(),
(uint16_t)(section->size - (typeInfo.size - curOfs)));
// Fill as much as possible without going out of bounds.
org = typeInfo.startAddr + typeInfo.size;
} else {
org += section->size;
}
} else {
org += section->size;
section->isAddressFixed = false;
section->isAlignFixed = floatingAlignMask != 0;
section->alignMask = floatingAlignMask;
section->alignOfs = floatingAlignOffset;
floatingAlignOffset = (floatingAlignOffset + section->size) & floatingAlignMask;
}
}
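
The floating-address bookkeeping added by this commit may be easier to follow in isolation. The sketch below mirrors the `floatingAlignMask` / `floatingAlignOffset` pair from the diff in a small standalone type. The struct `FloatingAlign` and its method names are hypothetical, and the sketch leaves out the error reporting and the `alignment >= 16` branch of the real code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone model of the floating-address state in the diff.
struct FloatingAlign {
    uint16_t mask = 0;   // Constrained low bits; 0 means "no alignment constraint"
    uint16_t offset = 0; // Value those bits must have for the next section

    // ALIGN while floating: record the constraint instead of moving an address.
    void align(uint32_t alignment, uint32_t ofs) {
        assert(alignment < 16); // Assumption: larger values are handled elsewhere
        uint32_t size = 1u << alignment;
        assert(ofs < size);     // The real code reports a script error here
        mask = static_cast<uint16_t>(size - 1);
        offset = static_cast<uint16_t>(ofs);
    }

    // DS, or a section being placed, while floating: only the offset within
    // the alignment window advances, wrapping around at the alignment size.
    void advance(uint32_t length) {
        offset = static_cast<uint16_t>((offset + length) & mask);
    }
};
```

For instance, after `align(8, 0)`, advancing by $30 bytes leaves an offset of $30, and advancing by a further $F0 bytes wraps it to $20. With no alignment recorded (`mask == 0`), the offset always stays at 0, so sections placed in that state carry no alignment constraint.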