Quoted Values and Tokenization
The lexer transforms raw user input into semantic tokens before parsing starts. Understanding this lexical layer is essential when values include spaces, punctuation, or apostrophes.
If a value must remain one atomic unit across spaces or significant punctuation, quote it.
Token categories
A compliant lexer distinguishes four functional categories:
- WORD for ordinary non-whitespace sequences that do not hit a recognized delimiter.
- STRING for values enclosed in single quotes.
- DELIMITER for recognized punctuation and operator tokens.
- EOF for the synthetic end-of-input marker appended by the lexer.
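As a rough illustration, the four categories could be modeled as an enumeration plus a small token record. The names TokenKind and Token are hypothetical and chosen only for this sketch; the actual lexer may represent tokens differently.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TokenKind(Enum):
    """Hypothetical names for the four categories described above."""
    WORD = auto()       # ordinary non-whitespace run that hits no delimiter
    STRING = auto()     # value that was enclosed in single quotes
    DELIMITER = auto()  # recognized punctuation or operator
    EOF = auto()        # synthetic end-of-input marker


@dataclass(frozen=True)
class Token:
    kind: TokenKind
    value: str  # for STRING, the value without the outer quotes


# One plausible token stream for the input:  SET TITLE 'Ops Review' ;
tokens = [
    Token(TokenKind.WORD, "SET"),
    Token(TokenKind.WORD, "TITLE"),
    Token(TokenKind.STRING, "Ops Review"),
    Token(TokenKind.DELIMITER, ";"),
    Token(TokenKind.EOF, ""),
]
```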
Whitespace rules
Whitespace separates tokens, but it has no standalone semantic meaning outside quoted strings. Spaces, tabs, and newlines matter only insofar as they influence token boundaries.
Inside a quoted string, whitespace is preserved as part of the semantic value.
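The whitespace rule can be sketched in isolation. The helper below, split_on_whitespace, is a hypothetical name and deliberately ignores delimiters; it only shows that runs of spaces, tabs, and newlines separate pieces outside quotes and are kept verbatim inside them. The outer quotes are left in place here, since removing them is a separate step.

```python
def split_on_whitespace(text: str) -> list[str]:
    """Split on whitespace that lies outside single-quoted regions.

    Illustrative only: delimiters are not recognized and the outer
    quotes are kept on each quoted piece.
    """
    pieces: list[str] = []
    current: list[str] = []
    in_quotes = False
    for ch in text:
        if ch == "'":
            # Toggle quoted state; a doubled apostrophe toggles twice,
            # so the state is unchanged after it.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch.isspace() and not in_quotes:
            # Whitespace outside quotes only ends the current piece.
            if current:
                pieces.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        pieces.append("".join(current))
    return pieces


pieces = split_on_whitespace("SET \t TITLE\n'a  b'")
```

In this sketch the three different whitespace runs between SET, TITLE, and the quoted value all behave the same way, while the two spaces inside the quotes survive.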
Quoted values
A quoted value is written with single quotes. The outer quotes are not part of the bound semantic value.
'hello world'
'db-backup'
'March 2026 maintenance window'
Quoted values are especially important when a domain value contains spaces, punctuation, or delimiter characters that would otherwise be split into multiple tokens.
Escaped apostrophes
Inside a quoted string, a doubled apostrophe represents one literal apostrophe in the semantic value.
'O''Reilly'
The bound semantic value is O'Reilly.
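A minimal sketch of how a quoted literal might be turned into its bound value: strip the outer quotes, then collapse each doubled apostrophe. The function name unquote is hypothetical, and the sketch assumes the lexer has already identified where the literal starts and ends.

```python
def unquote(literal: str) -> str:
    """Return the semantic value of a single-quoted literal.

    Assumes `literal` is a complete quoted token as delimited by the
    lexer, including its opening and closing quote.
    """
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a single-quoted literal: %r" % literal)
    body = literal[1:-1]           # the outer quotes are not part of the value
    return body.replace("''", "'")  # doubled apostrophe -> one literal apostrophe


owner = unquote("'O''Reilly'")
```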
Delimiters and atomicity
Delimiters such as the period (.), comma (,), semicolon (;), colon (:), parentheses, and ellipsis (...) are recognized by the lexer only outside quoted strings. Inside a quoted string, they remain ordinary content.
Unquoted input: report . pdf
Quoted input: 'report.pdf'
The first input yields three tokens before EOF: the WORD report, the DELIMITER ., and the WORD pdf. The second yields a single STRING token whose value is report.pdf.
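The difference between the two inputs can be reproduced with a small lexer sketch. Everything here is an assumption made for illustration: the function name tokenize, the tuple representation, the exact delimiter set (taken from the list above), the choice to try the longest delimiter first so that ... is not read as three periods, and the choice that a quote or delimiter ends an unquoted word.

```python
# Longest delimiter first so "..." wins over ".".
DELIMITERS = ("...", ".", ",", ";", ":", "(", ")")


def _delimiter_at(text: str, i: int):
    """Return the delimiter starting at text[i], or None."""
    return next((d for d in DELIMITERS if text.startswith(d, i)), None)


def tokenize(text: str) -> list[tuple[str, str]]:
    tokens: list[tuple[str, str]] = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1                      # whitespace only separates tokens
            continue
        if ch == "'":
            i += 1                      # skip the opening quote
            chars: list[str] = []
            while True:
                if i >= n:
                    raise SyntaxError("unclosed string: expected closing quote")
                if text[i] == "'":
                    if text[i + 1:i + 2] == "'":
                        chars.append("'")   # doubled apostrophe
                        i += 2
                        continue
                    i += 1              # skip the closing quote
                    break
                chars.append(text[i])   # delimiters and spaces are plain content
                i += 1
            tokens.append(("STRING", "".join(chars)))
            continue
        delim = _delimiter_at(text, i)
        if delim is not None:
            tokens.append(("DELIMITER", delim))
            i += len(delim)
            continue
        start = i
        while (i < n and not text[i].isspace() and text[i] != "'"
               and _delimiter_at(text, i) is None):
            i += 1
        tokens.append(("WORD", text[start:i]))
    tokens.append(("EOF", ""))          # synthetic end-of-input marker
    return tokens


unquoted = tokenize("report . pdf")
quoted = tokenize("'report.pdf'")
```

Under these assumptions the unquoted form report.pdf, written without spaces, splits the same way as report . pdf, which is why quoting is the only way to keep such a value atomic.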
When quoting is mandatory
- When the value contains spaces and must stay one parameter.
- When the value contains significant delimiter characters that must not be split.
- When the value begins with text that could be confused with a later grammar keyword.
- When you want the bound value to preserve embedded punctuation exactly as one token.
Unclosed strings
An opening quote without a matching closing quote is a syntax error. The rich diagnostic should report that a closing quote, or an equivalent expected possibility, was expected at the point where the input ended.
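One way such a diagnostic might be surfaced is an exception that carries the list of expected possibilities. This is a sketch, not the actual diagnostic format: LexError, read_quoted, and the position and expected attributes are hypothetical names, and reporting the offset of the opening quote is an assumption.

```python
class LexError(Exception):
    """Hypothetical lexer error carrying diagnostic details."""

    def __init__(self, message: str, position: int, expected: list[str]):
        super().__init__(message)
        self.position = position  # offset of the opening quote (assumed convention)
        self.expected = expected  # what the lexer would have accepted


def read_quoted(text: str, start: int) -> tuple[str, int]:
    """Read a quoted string whose opening quote is at text[start].

    Returns the semantic value and the index just past the closing quote.
    """
    if text[start:start + 1] != "'":
        raise ValueError("read_quoted must start at an opening quote")
    i = start + 1
    chars: list[str] = []
    while i < len(text):
        if text[i] == "'":
            if text[i + 1:i + 2] == "'":
                chars.append("'")   # doubled apostrophe is literal content
                i += 2
                continue
            return "".join(chars), i + 1
        chars.append(text[i])
        i += 1
    # Input ended before a closing quote was found.
    raise LexError(
        "unclosed string starting at offset %d" % start,
        position=start,
        expected=["'"],
    )


try:
    read_quoted("'Quarterly Ops Review ;", 0)
    error = None
except LexError as exc:
    error = exc
```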
Examples
SET TITLE 'Quarterly Ops Review' ;
SET OWNER 'O''Reilly' ;
EXECUTE 'COMPLEX.TASK' ON db01 ;
SET DATE 07 . 03 . 2026 ;
Practical guidance
Grammar authors should test both quoted and unquoted input for any command that accepts free-form values. This is where most subtle lexical bugs appear.
Do not assume that punctuation-heavy business values will remain atomic when unquoted. Make the quoting contract explicit in your examples and tests.