Clarify comment

Rangi42
2025-07-14 00:02:25 -04:00
parent eafc32fd68
commit 80df858ee3

@@ -1976,18 +1976,20 @@ static Token yylex_NORMAL() {
 }
 }
-// This is a "lexer hack"! We need it to distinguish between label definitions
-// (which start with `LABEL`) and macro invocations (which start with `SYMBOL`).
-//
-// If we had one `IDENTIFIER` token, the parser would need to perform "lookahead"
-// to determine which rule applies. But since macros need to enter "raw" mode to
-// parse their arguments, which may not even be valid tokens in "normal" mode, we
-// cannot use lookahead to check for the presence of a `COLON`.
-//
-// Instead, we have separate `SYMBOL` and `LABEL` tokens, lexing as a `LABEL` if a
-// ':' character *immediately* follows the identifier. Thus, at the beginning of a
-// line, "Label:" and "mac:" are treated as label definitions, but "Label :" and
-// "mac :" are treated as macro invocations.
+// We need it to distinguish between label definitions (which start with `LABEL`) and
+// macro invocations (which start with `SYMBOL`).
+//
+// If we had one `IDENTIFIER` token, the parser would need to perform "lookahead" to
+// determine which rule applies. But since macros need to enter "raw" mode to parse
+// their arguments, which may not even be valid tokens in "normal" mode, we cannot use
+// lookahead to check for the presence of a `COLON`.
+//
+// Instead, we have separate `SYMBOL` and `LABEL` tokens, lexing as a `LABEL` if a ':'
+// character *immediately* follows the identifier. Thus, "Label:" and "mac:" are treated
+// as label definitions, but "Label :" and "mac :" are treated as macro invocations.
+//
+// The alternative would be a "lexer hack" like C, where identifiers would lex as a
+// `SYMBOL` if they are already defined, otherwise as a `LABEL`.
 if (token.type == T_(SYMBOL) && peek() == ':') {
 	token.type = T_(LABEL);
 }
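
The rule the comment describes can be sketched in isolation. This is a minimal illustration and not the project's lexer: `TokenType` and `classifyLeadingIdentifier` are hypothetical names, and the identifier syntax is simplified to letters, digits, and underscores. It shows only the one-character peek: an identifier becomes a label when a ':' follows it with nothing in between.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical token kinds for illustration; not the real token set.
enum class TokenType { None, Symbol, Label };

// Hypothetical helper: classify the identifier at the start of `line`.
// Assumes a simplified identifier syntax (letters, digits, underscores).
TokenType classifyLeadingIdentifier(std::string_view line) {
    auto isIdentStart = [](unsigned char c) { return std::isalpha(c) != 0 || c == '_'; };
    auto isIdentChar = [](unsigned char c) { return std::isalnum(c) != 0 || c == '_'; };

    if (line.empty() || !isIdentStart(line[0])) {
        return TokenType::None; // the line does not start with an identifier
    }

    // Consume the identifier.
    std::size_t i = 1;
    while (i < line.size() && isIdentChar(line[i])) {
        ++i;
    }

    // Peek exactly one character: only an *immediately* following ':'
    // turns the identifier into a label. No further lookahead is needed,
    // so the rest of the line can still be handed to a different lexer mode.
    if (i < line.size() && line[i] == ':') {
        return TokenType::Label;
    }
    return TokenType::Symbol;
}
```

Under these assumptions, "Label:" and "mac:" classify as `Label`, while "Label :" and "mac :" classify as `Symbol`, which matches the behavior stated in the comment.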