Skip to content

Scrub CR/LF in SSE outputs to prevent field forging (fixes #18)#22

Closed
andriytyurnikov wants to merge 12 commits intostarfederation:mainfrom
rubakas:feature/sse-injection-scrubbing
Closed

Scrub CR/LF in SSE outputs to prevent field forging (fixes #18)#22
andriytyurnikov wants to merge 12 commits intostarfederation:mainfrom
rubakas:feature/sse-injection-scrubbing

Conversation

@andriytyurnikov
Copy link
Copy Markdown

Fixes #18.

Summary

Closes the SSE injection surface where caller-supplied strings reaching:

  • element bodies (patch_elements),
  • String signal payloads (patch_signals),
  • script bodies and attribute values (execute_script, redirect),
  • scalar option values (id, selector, mode, useViewTransition, etc.),
  • array/hash option entries,

could forge SSE fields the browser then dispatches as legitimate events. The WHATWG SSE parser treats \r, \n, and \r\n all as line terminators.

Approach

Two private helpers on ServerSentEventGenerator, applied at the API boundary (matches the sketch in #18):

Helper What it strips Where it runs
`scrub_body` `\r` only (`\n` preserved — used as the splitter) element bodies, script bodies, String signal payloads
`scrub_option` (recursive) both `\r` and `\n` every value passing through `build_options`; `execute_script` attributes

`scrub_option` walks Strings, Arrays, and Hashes; Ints/Floats/Booleans pass through. Hash signal payloads are unaffected because `JSON.dump` already escapes `\r`/`\n` in string values, so the serialized payload is always one safe line.

`execute_script` attribute values are scrubbed before the `<script>` tag is constructed, because they're interpolated raw into the tag and would otherwise bypass `patch_elements` element-body splitter.

Behavior change

Calls that don't pass \r/\n through user-controlled paths see no change. Existing 100-spec baseline still green. Calls that do pass through CR/LF strings now see those stripped (bodies) or normalized into the parent data: line (options) — which is what callers intended in 100% of legitimate cases.

Tests

17 new specs under `Dispatcher SSE injection guards`, including:

  • per-method `\r`/`\n`/`\r\n` coverage for `#patch_elements`, `#patch_signals`, `#remove_elements`, `#execute_script`, `#redirect`,
  • option-shape coverage for scalar, array, and hash entries,
  • a top-level test reproducing the original issue's attack vector and asserting only legitimate `event:` headers reach the browser.

Full suite: 117 examples, 0 failures.

Verification

The issue's exact reproduction script now produces only the 4 legitimate event: datastar-patch-elements headers (one per call). Strings that were previously parsed as forged SSE fields (event: forged, injected: evil) are now absorbed as text into the parent data: lines.

Test plan

Previously CI only exercised Ruby 4.0. Adds a build matrix so unit
tests and the Datastar SDK conformance suite run on every supported
Ruby. Also enables the workflow on pull_request events.
CI runs on ubuntu-latest. Without x86_64-linux locked, Bundler resolves
platform-specific gems at install time on CI, which can cause version
drift relative to local development. Locking the platform keeps CI
reproducible.
bundle install fails on Ruby 3.1 because console 1.34.2 (transitive
dep via async) requires ruby >= 3.2. The dependency tree's effective
floor is 3.2.
Adds an optional generator_class: keyword to Dispatcher.new (defaulting
to ServerSentEventGenerator) and routes the three internal generator
instantiations — stream_one, stream_many's connection generator, and
each per-stream generator inside spawned threads — through it.

Lets downstream wrappers layer behavior (logging, metrics, input
scrubbing, etc.) on top of the SDK by passing a subclass, instead of
monkey-patching the SDK class.
Pre-fix #redirect interpolated the URL raw inside a single-quoted JS
string:

  window.location = '#{url}'

A ' in url broke out of the literal, letting attacker-influenced
fragments execute as JS. The classic vulnerable pattern:

  datastar.redirect("/page?ref=#{params[:ref]}")

is a Rails idiom that's safe under Rails' own redirect_to, so the
mismatch is a footgun.

Fix: encode the URL with JSON.generate(ascii_only: true,
escape_slash: true). The output is a properly-quoted JS string
literal that:

  - escapes ", \, and control characters,
  - escapes U+2028 / U+2029 (which terminate JS string literals
    even when delimited by " or '),
  - escapes / to \/ so a </script> substring in the URL can't
    prematurely close the surrounding <script> tag during HTML
    parsing (the parser does not recognize <\/script> as the end
    tag, while \/ is a no-op inside a JS string literal).

Output format change: window.location is now wrapped in double
quotes with JSON-style escapes instead of single quotes. Behavior
at runtime is unchanged for safe URLs.

6 new specs cover single quotes, double quotes, backslashes,
</script> breakout, U+2028, and U+2029.
stream_one's lifecycle contract (handling_sync_errors) fires exactly
one of on_server_disconnect / on_client_disconnect / on_error per
stream — never a combination. stream_many violated this: its
ensure block always fired on_server_disconnect, even when an error
or client disconnect had already triggered the matching callback.

Consumers writing observability/cleanup logic against these
callbacks would see different behavior depending on whether they
called .stream once or many times — a subtle source of bugs.

Fix: track a completed_normally flag in stream_many's control
thread, set it false when handle_streaming_error runs, and only
fire on_server_disconnect in the ensure block when the flag is
still true.

2 new shared examples (run for both ThreadExecutor and
AsyncExecutor):
- streamer raises → on_error fires, on_server_disconnect doesn't
- client disconnects → on_client_disconnect fires, on_server_disconnect doesn't
@andriytyurnikov
Copy link
Copy Markdown
Author

Upstream CI hasn't fired on this PR (workflow is on: push only). Fork CI run on the same commit, green: https://github.com/rubakas/datastar-ruby/actions/runs/24990606156 (Ruby 4.0 on Linux + full Datastar SDK conformance suite). 117 examples, 0 failures — 17 new specs cover element bodies, scalar/array/hash options, String and Hash signals, remove_elements, execute_script (body and attributes), redirect, and a top-level reproduction of the original issue.

@andriytyurnikov andriytyurnikov force-pushed the feature/sse-injection-scrubbing branch from c173c0d to 39a6094 Compare April 27, 2026 11:51
CI: run matrix against Ruby 3.2, 3.3, 3.4, 4.0
Lock x86_64-linux platform in Gemfile.lock
Allow Dispatcher to inject a custom generator_class
Make multi-stream disconnect callbacks mutually exclusive
Encode redirect URL as a safe JS string literal
Closes the SSE injection surface where attacker-controlled strings
reaching element bodies, script bodies, attribute values, scalar
option values, or array/hash option entries could forge SSE fields
the browser then dispatches as legitimate events. The WHATWG SSE
parser treats \r, \n, and \r\n all as line terminators.

Two scrubbers, applied at the API boundary:

- scrub_body: strip \r only (used for element/script bodies and
  String signal payloads). \n is preserved because the SDK splits
  bodies on \n to emit per-line `data:` fields.
- scrub_option: recursively strip \r and \n from option values
  (Strings, Arrays, Hashes). Option values are written as single
  `data:` lines, so any embedded line terminator forges a field.

Hash signals are unaffected — JSON.dump already escapes \r/\n in
string values, so the serialized payload is always a single safe
line.

execute_script attribute values are scrubbed pre-tag-construction
since they are interpolated raw and bypass patch_elements' splitter.

17 new specs covering element bodies, scalar/array/hash options,
String and Hash signals, remove_elements, execute_script (body
and attributes), redirect, and a top-level reproduction of the
issue's attack vector.
@andriytyurnikov andriytyurnikov force-pushed the feature/sse-injection-scrubbing branch from 39a6094 to 89010fc Compare April 27, 2026 12:01
@andriytyurnikov andriytyurnikov deleted the feature/sse-injection-scrubbing branch April 27, 2026 12:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SSE injection: bare \r and \n in scalar option values survive serialization

1 participant