Skip to content

fix: harden DoS and panic surfaces found in deep security audit#2006

Merged
chaliy merged 13 commits into
mainfrom
claude/vigilant-fermat-tztypx
Jun 10, 2026
Merged

fix: harden DoS and panic surfaces found in deep security audit#2006
chaliy merged 13 commits into
mainfrom
claude/vigilant-fermat-tztypx

Conversation

@chaliy

@chaliy chaliy commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

What

A deep security/bug audit of the sandboxed interpreter surfaced two default-reachable, process-fatal DoS bugs plus a set of contained panics and hardening gaps. This PR fixes all of them. No sandbox escape (real FS / network / process / host-env) was found on any surface; the exposure was DoS and a few defensive gaps.

Why

bashkit runs untrusted/LLM-generated scripts, so the script author is the attacker. A single line of script could crash the host:

  • echo {a,b}{a,b}… (~250 KB, under the input cap) stack-overflowed the worker thread.
  • jq -n 'repeat(1)' grew output unboundedly / hung (jaq's synchronous iterator can't be preempted by the async timeout).

How

HIGH — default-reachable, process-fatal DoS

  • Brace expansion (interpreter/mod.rs): the combinatorial count cap was charged only on recursion return, so a comma-list descended one frame per group before any cap engaged. Add a recursion-depth cap and a cumulative output-byte budget (MAX_EXPANSION_RESULT_BYTES) so runaway expansion degrades to a bounded literal. (TM-DOS-042)
  • jq (builtins/jq/mod.rs): cap accumulated output at the caller's max_stdout_bytes and poll a new ExecutionDeadline::is_expired() every 4096 values. (TM-DOS-093)

Medium / contained panics / semantics

  • eval and alias expansion ran bodies via execute(), which fires the EXIT trap (wrong bash semantics) and — with no re-entrancy guard — let trap 'eval :' EXIT recurse. Route both through execute_script_body(run_exit_trap=false) like source. (TM-DOS-030)
  • expr i64::MIN / -1 (and % -1) and date -d month/year multiply + chrono add → checked arithmetic.
  • Credential placeholder mode: refuse substitution when the placeholder is repeated (blocks secret segmentation/fingerprinting); fail closed.
  • Python (Monty) duration now clamped to the caller's execution deadline (mirrors the TypeScript builtin).
  • SQLite open/dot-command errors now routed through sanitize() (TM-SQL-011).
  • MemoryBudget::check_function_insert uses saturating subtraction.
  • Two O(n²) lexer lookahead scans bounded (TM-DOS-024).

Spec

  • Corrected threat-model drift (TM-DOS-029/030/031 were marked open/NEEDED but are fixed; fixed the misleading TM-DOS-042 mitigation text), added TM-DOS-093, and documented that FS byte/count quotas do not apply to RealFs ReadWrite mounts (by design; --mount-rw is sandbox-breaking).

Tests

Every fix has a failing-test-first regression test (brace recursion bomb, jq unbounded generator, eval EXIT-trap timing + recursion termination, expr/date overflow, credential repeat fail-closed, python deadline clamp via the public API, sqlite error paths, lexer/limits).

Verified green: lib (2445), integration with http_client,jq,sqlite (1054), python (280), realfs (36), failpoints (14), typescript (116 + doctests); cargo fmt --check and clippy --all-targets -- -D warnings clean. End-to-end smoke via the CLI confirms the brace bomb produces bounded output and jq -n 'repeat(1)' returns a clean output limit exceeded error (rc=5), neither crashing nor hanging.

Not included (need maintainer decision)

  • SSH strict-host-key UX cliff (disabled mode accepts any key with only a warning) — a design choice (consider key-pinning-on-first-use).
  • Credential single-echo on a fully header-reflecting approved host — needs response scrubbing (after_http), a spec-level feature.
  • --mount-rw unmetered host access — by design; only documented here.

Copilot AI review requested due to automatic review settings June 9, 2026 23:23
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 9, 2026

Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
bashkit e47cd39 Commit Preview URL

Branch Preview URL
Jun 10 2026, 01:08 AM

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens bashkit’s sandboxed interpreter and core builtins against security-audit findings that were default-reachable DoS vectors (process-fatal stack overflow / unbounded synchronous generation), plus several contained panic surfaces and limit-propagation gaps. It adds defensive caps (depth/bytes/deadline polling), corrects shell semantics around EXIT traps for eval/alias expansion, and adds regression tests documenting the threat-model fixes.

Changes:

  • Hardened brace expansion and jq execution against unbounded recursion/output and synchronous-iterator hangs (new depth/byte caps + deadline polling).
  • Fixed semantic and safety issues (EXIT trap behavior for eval/aliases, checked arithmetic in expr/date, safer credential placeholder substitution, saturating memory accounting, bounded lexer lookahead).
  • Added/updated threat-model documentation and regression tests for each finding.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
specs/threat-model.md Updates threat-model status text and documents RealFs RW quota scope; adds TM-DOS-093 entry and clarifies TM-DOS-042 mitigation.
crates/bashkit/tests/integration/threat_model_tests.rs Adds integration regression for brace expansion comma-list recursion bomb (TM-DOS-042).
crates/bashkit/tests/integration/python_security_tests.rs Adds regression ensuring bash timeout bounds in-process Python duration.
crates/bashkit/tests/integration/blackbox_security_tests.rs Adds regressions for eval EXIT-trap timing and EXIT-trap recursion termination.
crates/bashkit/src/parser/lexer.rs Bounds lexer lookahead to avoid O(n²) scans (TM-DOS-024).
crates/bashkit/src/limits.rs Uses saturating subtraction in function-body byte accounting to avoid underflow panics.
crates/bashkit/src/interpreter/mod.rs Routes alias/eval through execute_script_body (no early EXIT trap) and hardens brace expansion with depth + output-byte caps (TM-DOS-042).
crates/bashkit/src/credential.rs Refuses placeholder substitution when the placeholder repeats (fail-closed) + adds test.
crates/bashkit/src/builtins/sqlite/mod.rs Sanitizes SQLite open/runtime error messages before returning to user.
crates/bashkit/src/builtins/python.rs Clamps Monty runtime duration to remaining bash execution deadline.
crates/bashkit/src/builtins/mod.rs Adds ExecutionDeadline::is_expired() for synchronous-loop builtins.
crates/bashkit/src/builtins/jq/tests.rs Adds regression for unbounded generator output cap (TM-DOS-093).
crates/bashkit/src/builtins/jq/mod.rs Enforces stdout-byte cap and periodically polls execution deadline during synchronous jaq iteration (TM-DOS-093).
crates/bashkit/src/builtins/expr.rs Uses checked div/rem to avoid i64::MIN / -1 and % -1 panic; adds tests.
crates/bashkit/src/builtins/date.rs Uses checked chrono arithmetic to avoid overflow panics on extreme relative offsets; adds tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/bashkit/tests/integration/python_security_tests.rs Outdated
chaliy added 13 commits June 10, 2026 01:06
The combinatorial count cap was only charged after each recursive call
returned, so the first DFS path descended one frame per brace group with
the count still zero. An input like {a,b}{a,b}... repeated tens of
thousands of times (far under max_input_bytes) stack-overflowed the
worker thread, and could accumulate O(depth*suffix) memory, before any
cap or the execution timeout could fire.

Add a recursion-depth cap and a cumulative output-byte budget
(MAX_EXPANSION_RESULT_BYTES) so a runaway expansion degrades to a bounded
literal result instead of crashing the host.
jaq evaluation is a synchronous iterator that the async execution
timeout cannot preempt, and the result loop appended every produced
value to an in-memory String with no byte cap, value-count cap, or
deadline check. 'jq -n repeat(1)' (or range(0;1e18)) therefore grew
output without limit (OOM) or spun forever producing no output (hang),
reachable by default since jq is a core builtin.

Cap accumulated output against the caller's max_stdout_bytes and poll a
new ExecutionDeadline::is_expired() periodically so a runaway filter
aborts with a clean error instead of wedging the host.
expr's / and % used raw a / b while + - * already used checked_*, so
'expr -9223372036854775808 / -1' (and % -1) overflowed: a debug-build
panic, a wrong wrapping result in release. Use checked_div/checked_rem
and return the existing 'integer overflow' error.
unit_duration multiplied n * 30 / n * 365 and built chrono Durations
unchecked, then the caller added the offset to a base datetime unchecked.
Inputs like 'date -d "30000000000000000 years"' overflowed the multiply
or the chrono range and panicked. Use checked_mul + Duration::try_* and
checked_add_signed, returning a 'date out of range' error instead.
eval and alias expansion called Interpreter::execute(), which runs the
EXIT trap, instead of execute_script_body() like source does. This is
wrong bash semantics — the EXIT trap fired at the end of every eval'd or
aliased command rather than at shell exit — and because the top-level
EXIT-trap block has no re-entrancy guard, 'trap "eval :" EXIT' recursed
one command per level until the budget aborted, risking a stack overflow
first. Route both through execute_script_body(.., run_exit_trap=false).
The byte-accounting check used self.function_body_bytes + body_bytes -
old_body_bytes while the sibling record_function_insert already used
saturating_sub. If accounting drifts (e.g. after a snapshot restore
recomputes function sizes differently) so old_body_bytes exceeds the
running total, the raw subtraction underflow-panics. Mirror the
saturating arithmetic.
Placeholder mode used substring replace(), so a script could repeat the
placeholder in a credential-owned header to segment/fingerprint the real
secret across a header-reflecting endpoint. Substitute only when the
placeholder occurs exactly once; otherwise fail closed, sending
placeholders rather than the secret. (The single-echo risk on a fully
reflecting approved host remains a documented v1 limitation.)
The Python builtin passed only its own PythonLimits.max_duration to
Monty, ignoring the caller's ExecutionDeadline. Because the async
execution timeout cannot preempt Monty's synchronous start/resume
sections, a CPU-bound script could overrun a tighter bash timeout up to
Monty's own (default 30s) limit. Clamp the VM duration to the remaining
deadline before running, mirroring the TypeScript builtin.
read_word_or_fd_redirect collected all remaining input to inspect a
2-char redirect operator, and looks_like_assoc_assign scanned unbounded
leading whitespace. Both made a crafted input (millions of digit-led
words, or '$x=(' + megabytes of spaces) O(n) per call and the whole lex
O(n^2), bounded only by the wall-clock parser timeout (TM-DOS-024). Cap
the digit lookahead to 4 chars and reuse MAX_LOOKAHEAD for the
whitespace scan, matching looks_like_brace_expansion.
sanitize() was applied only to top-level statement errors. The
open_pure_memory / open_file_engine failures and dot-command errors
(which carry raw turso messages that can include crate-relative paths)
were interpolated directly into stderr. Route all four emission points
through sanitize() (TM-SQL-011, idempotent on already-clean messages).
- TM-DOS-029/030/031: narrative + 'NEEDED' status said open; they are
  mitigated/fixed. Update the prose and the verification matrix to match
  the status table.
- TM-DOS-042: the old 'bails out of recursion when budget is hit' claim
  was false for comma-lists (count charged on return); document the
  recursion-depth + output-byte caps that actually bound it.
- Add TM-DOS-093 (jq unbounded generator) and reconcile the in-code
  threat ID to the next sequential number.
- Document that the FS byte/count quotas do not apply to RealFs
  ReadWrite mounts (by design; --mount-rw is sandbox-breaking).
The new overflow test used {spec:?}, which the no_debug_fmt_in_builtin_source
static guard (TM-INF-022) rejects in builtin sources. spec is a &str, so
use Display.
The clamp test set a 300ms bash timeout and unwrapped the exec result,
which could panic (and flake on loaded CI) if the outer execution
timeout fired and returned Err(Timeout). Use a 2s timeout and accept
either a non-zero Ok exit or a Timeout Err; the elapsed-time bound (well
under the 60s Python limit) is the real regression guard.
@chaliy chaliy force-pushed the claude/vigilant-fermat-tztypx branch from 84e264a to e47cd39 Compare June 10, 2026 01:08
@chaliy chaliy merged commit 61080c6 into main Jun 10, 2026
35 checks passed
@chaliy chaliy deleted the claude/vigilant-fermat-tztypx branch June 10, 2026 01:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants