fix(dsl): avoid panic on unterminated string escape at EOF #3024

Open
SAY-5 wants to merge 1 commit into Permify:master from SAY-5:fix-lexer-unterminated-escape-panic
Conversation

SAY-5 commented Jul 2, 2026

Fixes #3004.

When the schema lexer reaches a backslash that is the final character of the input (an unterminated escape), readChar sets l.ch to 0 at EOF. The escape branch then advances position past the end of the input, so the trailing str += l.input[position:l.position] slices out of bounds and panics. The smallest trigger is a quote followed by a backslash ("\), reachable from SchemaWrite via Parse().

This change breaks out of the escape branch when EOF is hit, before position overshoots, so the malformed string lexes as a normal STRING token instead of panicking. It also adds a lexer test that panics without the guard.
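
The failure mode described above can be reproduced with a small model. This is a sketch under stated assumptions, not Permify's lexer: `toyLexer`, `tryLex`, and the `guard` flag are hypothetical names, and escape handling is reduced to copying the escaped byte as-is.

```go
package main

import "fmt"

// toyLexer is a simplified, hypothetical stand-in for the schema lexer.
type toyLexer struct {
	input        string
	position     int  // index of the current character
	readPosition int  // index of the next character to read
	ch           byte // current character, 0 at EOF
}

// readChar advances one character and sets ch to 0 once the input is
// exhausted. position keeps growing past len(input) on repeated calls.
func (l *toyLexer) readChar() {
	if l.readPosition >= len(l.input) {
		l.ch = 0
	} else {
		l.ch = l.input[l.readPosition]
	}
	l.position = l.readPosition
	l.readPosition++
}

// lexString reads the body of a string literal. The guard flag toggles
// the EOF check that follows a backslash.
func (l *toyLexer) lexString(guard bool) string {
	var str string
	l.readChar() // move past the opening quote
	start := l.position
	for l.ch != '"' && l.ch != 0 {
		if l.ch == '\\' {
			str += l.input[start:l.position]
			l.readChar() // skip the backslash
			if guard && l.ch == 0 {
				break // unterminated escape: stop before start overshoots
			}
			str += string(l.ch) // toy handling: keep the escaped byte as-is
			l.readChar()
			start = l.position
			continue
		}
		l.readChar()
	}
	return str + l.input[start:l.position]
}

// tryLex runs lexString and reports whether it panicked.
func tryLex(input string, guard bool) (lit string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	l := &toyLexer{input: input}
	l.readChar() // load the opening quote
	return l.lexString(guard), false
}

func main() {
	for _, guard := range []bool{false, true} {
		lit, panicked := tryLex(`"\`, guard)
		fmt.Printf("guard=%v literal=%q panicked=%v\n", guard, lit, panicked)
	}
}
```

In this model, the unguarded path panics on the final slice, while the guarded path returns a token whose literal still contains the backslash.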

Summary by CodeRabbit

  • Bug Fixes
    • Improved string parsing to safely handle a trailing backslash at the end of input.
    • Prevented crashes when an escape sequence is cut off before completion.
    • Added test coverage for unterminated string escapes to confirm tokenization remains stable.

Signed-off-by: Sai Asish Y <say.apm35@gmail.com>
github-actions Bot commented Jul 2, 2026

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

coderabbitai Bot commented Jul 2, 2026

📝 Walkthrough

The DSL lexer's string-escape handling now checks for end-of-input after consuming a backslash and breaks out of the escape loop instead of continuing to dispatch escape logic. A regression test verifies that lexing an unterminated string ending in a backslash returns a STRING token without panicking.

Changes

Lexer EOF Guard

| Layer / File(s) | Summary |
| --- | --- |
| Escape EOF guard and regression test: pkg/dsl/lexer/lexer.go, pkg/dsl/lexer/lexer_test.go | Adds an l.ch == 0 check in lexString after a backslash to break the escape-processing loop, and adds a test asserting NextToken() does not panic and returns token.STRING for a trailing-backslash input. |

Estimated code review effort: 1 (Trivial) | ~5 minutes

Related Issues

Suggested labels: bug, lexer

Suggested reviewers: tolgaOzen


🐰 A backslash trailed off into the night,
The lexer stumbled, lost its sight,
Now it pauses, checks, and knows the end,
No more panics 'round the bend! 🌙

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Linked Issues check | ⚠️ Warning | [#3004] It fixes the panic, but it does not implement the requested syntax error behavior and the regression test is not under pkg/dsl/parser. | Return a syntax error for malformed input in Parse() and add the regression test under pkg/dsl/parser as requested. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: preventing a DSL lexer panic on an unterminated string escape. |
| Out of Scope Changes check | ✅ Passed | The patch stays focused on the lexer panic fix and regression test, with no unrelated code paths changed. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/dsl/lexer/lexer_test.go (1)

1192-1200: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Also assert on tok.Literal to catch content-corruption regressions.

This test only checks the token type, so it wouldn't catch the stray-backslash literal issue flagged in lexer.go. Consider also asserting tok.Literal (e.g., expecting an empty string) once that fix lands.
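
A minimal sketch of why the extra assertion matters, under stated assumptions: `Token` and `checkString` are hypothetical stand-ins rather than the project's types or test helpers, and the two tokens are hand-built to mirror the literals discussed in this review instead of being produced by the real lexer.

```go
package main

import "fmt"

// Token is a hypothetical stand-in for the lexer's token type.
type Token struct {
	Type    string
	Literal string
}

// checkString returns an error unless tok is a STRING token carrying
// exactly the expected literal.
func checkString(tok Token, wantLiteral string) error {
	if tok.Type != "STRING" {
		return fmt.Errorf("type: got %q, want %q", tok.Type, "STRING")
	}
	if tok.Literal != wantLiteral {
		return fmt.Errorf("literal: got %q, want %q", tok.Literal, wantLiteral)
	}
	return nil
}

func main() {
	// Literal with a stray backslash, as flagged for the guard-only change.
	leaky := Token{Type: "STRING", Literal: `\`}
	// Literal expected once the stray backslash is dropped.
	clean := Token{Type: "STRING", Literal: ""}

	// Both tokens have type STRING, so a type-only check accepts both;
	// only the literal comparison tells them apart.
	fmt.Println("leaky:", checkString(leaky, ""))
	fmt.Println("clean:", checkString(clean, ""))
}
```

A type-only assertion would accept both tokens, which is the gap this nitpick points at.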

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/dsl/lexer/lexer_test.go` around lines 1192 - 1200, The Unterminated
string test in lexer_test.go only verifies the token type, so it can miss
literal corruption regressions from the lexer. Update the `Unterminated string
ending in a backslash does not panic` test around `NewLexer` and `NextToken` to
also assert `tok.Literal`, ideally expecting the corrected empty string literal,
so it fails if the stray-backslash content returns again.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/dsl/lexer/lexer.go`:
- Around line 212-219: The string scanning logic in lexer.go’s string-reading
path is still re-adding the trailing backslash when EOF is reached after an
escape. Update the branch in the lexer’s string token accumulation (around the
readChar/break handling in the string literal parser) so the consumed backslash
is not included in the final token value; advance the tracking position before
breaking or otherwise exclude that segment from the final append. Verify the fix
in the string token builder used by the lexer so an input ending with a lone
backslash does not return a STRING literal containing a raw backslash.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 91c011a7-5866-4a07-8421-abf372f9a378

📥 Commits

Reviewing files that changed from the base of the PR and between aa3a7c6 and 10c2894.

📒 Files selected for processing (2)
  • pkg/dsl/lexer/lexer.go
  • pkg/dsl/lexer/lexer_test.go

Comment thread pkg/dsl/lexer/lexer.go
Comment on lines 212 to +219
		if l.ch == '\\' {
			str += l.input[position:l.position]
			l.readChar() // Skip the backslash
			if l.ch == 0 {
				// Backslash at end of input (unterminated escape); stop before
				// position runs past the input.
				break
			}

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Stray backslash leaks into token literal on EOF-after-backslash.

The panic is fixed, but position isn't advanced before the new break, so the final str += l.input[position:l.position] (line 236) re-includes the just-consumed backslash character. For input "\, the returned STRING literal ends up containing a raw \ instead of being clean/empty, silently corrupting the parsed value instead of surfacing malformed input.

🐛 Proposed fix
 			l.readChar() // Skip the backslash
 			if l.ch == 0 {
 				// Backslash at end of input (unterminated escape); stop before
 				// position runs past the input.
+				position = l.position
 				break
 			}
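
The effect of the one-line change can be sketched with a simplified scanner. Everything here is an assumption made for illustration: `lexBody` and its `resetStart` flag are hypothetical, the function takes only the text after the opening quote, and escape handling is reduced to copying the escaped byte.

```go
package main

import "fmt"

// lexBody is a hypothetical, simplified model of a string-literal scanner.
// input is the text after the opening quote. resetStart mirrors the
// suggested `position = l.position` line before the break.
func lexBody(input string, resetStart bool) string {
	var str string
	pos, start := 0, 0
	for pos < len(input) && input[pos] != '"' {
		if input[pos] == '\\' {
			str += input[start:pos] // flush the text before the backslash
			pos++                   // skip the backslash
			if pos >= len(input) {  // EOF right after the backslash
				if resetStart {
					start = pos // drop the already-flushed segment and the backslash
				}
				break
			}
			str += string(input[pos]) // toy handling: keep the escaped byte as-is
			pos++
			start = pos
			continue
		}
		pos++
	}
	// Final append of whatever lies between start and the stopping point.
	return str + input[start:pos]
}

func main() {
	for _, in := range []string{`\`, `ab\`} {
		fmt.Printf("input=%q without reset=%q with reset=%q\n",
			in, lexBody(in, false), lexBody(in, true))
	}
}
```

In this model, skipping the reset re-appends everything from the old start index, so text before the backslash is duplicated as well as the backslash itself. Whether the real lexer behaves the same way for inputs such as "ab\ would need to be checked against the actual code.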
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 		if l.ch == '\\' {
 			str += l.input[position:l.position]
 			l.readChar() // Skip the backslash
 			if l.ch == 0 {
 				// Backslash at end of input (unterminated escape); stop before
 				// position runs past the input.
+				position = l.position
 				break
 			}
Development

Successfully merging this pull request may close these issues.

[BUG] DSL lexer panics (slice out of range) on an unterminated string

1 participant