fix: harden DoS and panic surfaces found in deep security audit by chaliy · Pull Request #2006 · everruns/bashkit

chaliy · 2026-06-09T23:23:11Z

What

A deep security/bug audit of the sandboxed interpreter surfaced two default-reachable, process-fatal DoS bugs plus a set of contained panics and hardening gaps. This PR fixes all of them. No sandbox escape (real FS / network / process / host-env) was found on any surface; the exposure was DoS and a few defensive gaps.

Why

bashkit runs untrusted/LLM-generated scripts, so the script author is the attacker. A single line of script could crash the host:

echo {a,b}{a,b}… (~250 KB, under the input cap) stack-overflowed the worker thread.
jq -n 'repeat(1)' grew output unboundedly / hung (jaq's synchronous iterator can't be preempted by the async timeout).

How

HIGH — default-reachable, process-fatal DoS

Brace expansion (interpreter/mod.rs): the combinatorial count cap was charged only on recursion return, so a comma-list descended one frame per group before any cap engaged. Add a recursion-depth cap and a cumulative output-byte budget (MAX_EXPANSION_RESULT_BYTES) so runaway expansion degrades to a bounded literal. (TM-DOS-042)
jq (builtins/jq/mod.rs): cap accumulated output at the caller's max_stdout_bytes and poll a new ExecutionDeadline::is_expired() every 4096 values. (TM-DOS-093)

Medium / contained panics / semantics

eval and alias expansion ran bodies via execute(), which fires the EXIT trap (wrong bash semantics) and — with no re-entrancy guard — let trap 'eval :' EXIT recurse. Route both through execute_script_body(run_exit_trap=false) like source. (TM-DOS-030)
expr i64::MIN / -1 (and % -1) and date -d month/year multiply + chrono add → checked arithmetic.
Credential placeholder mode: refuse substitution when the placeholder is repeated (blocks secret segmentation/fingerprinting); fail closed.
Python (Monty) duration now clamped to the caller's execution deadline (mirrors the TypeScript builtin).
SQLite open/dot-command errors now routed through sanitize() (TM-SQL-011).
MemoryBudget::check_function_insert uses saturating subtraction.
Two O(n²) lexer lookahead scans bounded (TM-DOS-024).

Spec

Corrected threat-model drift (TM-DOS-029/030/031 were marked open/NEEDED but are fixed; fixed the misleading TM-DOS-042 mitigation text), added TM-DOS-093, and documented that FS byte/count quotas do not apply to RealFs ReadWrite mounts (by design; --mount-rw is sandbox-breaking).

Tests

Every fix has a failing-test-first regression test (brace recursion bomb, jq unbounded generator, eval EXIT-trap timing + recursion termination, expr/date overflow, credential repeat fail-closed, python deadline clamp via the public API, sqlite error paths, lexer/limits).

Verified green: lib (2445), integration with http_client,jq,sqlite (1054), python (280), realfs (36), failpoints (14), typescript (116 + doctests); cargo fmt --check and clippy --all-targets -- -D warnings clean. End-to-end smoke via the CLI confirms the brace bomb produces bounded output and jq -n 'repeat(1)' returns a clean output limit exceeded error (rc=5), neither crashing nor hanging.

Not included (need maintainer decision)

SSH strict-host-key UX cliff (disabled mode accepts any key with only a warning) — a design choice (consider key-pinning-on-first-use).
Credential single-echo on a fully header-reflecting approved host — needs response scrubbing (after_http), a spec-level feature.
--mount-rw unmetered host access — by design; only documented here.

cloudflare-workers-and-pages · 2026-06-09T23:23:16Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Preview URL	Updated (UTC)
✅ Deployment successful! View logs	bashkit	`e47cd39`	Commit Preview URL Branch Preview URL	Jun 10 2026, 01:08 AM

Copilot

Pull request overview

This PR hardens bashkit’s sandboxed interpreter and core builtins against security-audit findings that were default-reachable DoS vectors (process-fatal stack overflow / unbounded synchronous generation), plus several contained panic surfaces and limit-propagation gaps. It adds defensive caps (depth/bytes/deadline polling), corrects shell semantics around EXIT traps for eval/alias expansion, and adds regression tests documenting the threat-model fixes.

Changes:

Hardened brace expansion and jq execution against unbounded recursion/output and synchronous-iterator hangs (new depth/byte caps + deadline polling).
Fixed semantic and safety issues (EXIT trap behavior for eval/aliases, checked arithmetic in expr/date, safer credential placeholder substitution, saturating memory accounting, bounded lexer lookahead).
Added/updated threat-model documentation and regression tests for each finding.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
specs/threat-model.md	Updates threat-model status text and documents RealFs RW quota scope; adds TM-DOS-093 entry and clarifies TM-DOS-042 mitigation.
crates/bashkit/tests/integration/threat_model_tests.rs	Adds integration regression for brace expansion comma-list recursion bomb (TM-DOS-042).
crates/bashkit/tests/integration/python_security_tests.rs	Adds regression ensuring bash timeout bounds in-process Python duration.
crates/bashkit/tests/integration/blackbox_security_tests.rs	Adds regressions for `eval` EXIT-trap timing and EXIT-trap recursion termination.
crates/bashkit/src/parser/lexer.rs	Bounds lexer lookahead to avoid O(n²) scans (TM-DOS-024).
crates/bashkit/src/limits.rs	Uses saturating subtraction in function-body byte accounting to avoid underflow panics.
crates/bashkit/src/interpreter/mod.rs	Routes alias/eval through `execute_script_body` (no early EXIT trap) and hardens brace expansion with depth + output-byte caps (TM-DOS-042).
crates/bashkit/src/credential.rs	Refuses placeholder substitution when the placeholder repeats (fail-closed) + adds test.
crates/bashkit/src/builtins/sqlite/mod.rs	Sanitizes SQLite open/runtime error messages before returning to user.
crates/bashkit/src/builtins/python.rs	Clamps Monty runtime duration to remaining bash execution deadline.
crates/bashkit/src/builtins/mod.rs	Adds `ExecutionDeadline::is_expired()` for synchronous-loop builtins.
crates/bashkit/src/builtins/jq/tests.rs	Adds regression for unbounded generator output cap (TM-DOS-093).
crates/bashkit/src/builtins/jq/mod.rs	Enforces stdout-byte cap and periodically polls execution deadline during synchronous jaq iteration (TM-DOS-093).
crates/bashkit/src/builtins/expr.rs	Uses checked div/rem to avoid `i64::MIN / -1` and `% -1` panic; adds tests.
crates/bashkit/src/builtins/date.rs	Uses checked chrono arithmetic to avoid overflow panics on extreme relative offsets; adds tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The combinatorial count cap was only charged after each recursive call returned, so the first DFS path descended one frame per brace group with the count still zero. An input like {a,b}{a,b}... repeated tens of thousands of times (far under max_input_bytes) stack-overflowed the worker thread, and could accumulate O(depth*suffix) memory, before any cap or the execution timeout could fire. Add a recursion-depth cap and a cumulative output-byte budget (MAX_EXPANSION_RESULT_BYTES) so a runaway expansion degrades to a bounded literal result instead of crashing the host.

jaq evaluation is a synchronous iterator that the async execution timeout cannot preempt, and the result loop appended every produced value to an in-memory String with no byte cap, value-count cap, or deadline check. 'jq -n repeat(1)' (or range(0;1e18)) therefore grew output without limit (OOM) or spun forever producing no output (hang), reachable by default since jq is a core builtin. Cap accumulated output against the caller's max_stdout_bytes and poll a new ExecutionDeadline::is_expired() periodically so a runaway filter aborts with a clean error instead of wedging the host.

expr's / and % used raw a / b while + - * already used checked_*, so 'expr -9223372036854775808 / -1' (and % -1) overflowed: a debug-build panic, a wrong wrapping result in release. Use checked_div/checked_rem and return the existing 'integer overflow' error.

unit_duration multiplied n * 30 / n * 365 and built chrono Durations unchecked, then the caller added the offset to a base datetime unchecked. Inputs like 'date -d "30000000000000000 years"' overflowed the multiply or the chrono range and panicked. Use checked_mul + Duration::try_* and checked_add_signed, returning a 'date out of range' error instead.

eval and alias expansion called Interpreter::execute(), which runs the EXIT trap, instead of execute_script_body() like source does. This is wrong bash semantics — the EXIT trap fired at the end of every eval'd or aliased command rather than at shell exit — and because the top-level EXIT-trap block has no re-entrancy guard, 'trap "eval :" EXIT' recursed one command per level until the budget aborted, risking a stack overflow first. Route both through execute_script_body(.., run_exit_trap=false).

The byte-accounting check used self.function_body_bytes + body_bytes - old_body_bytes while the sibling record_function_insert already used saturating_sub. If accounting drifts (e.g. after a snapshot restore recomputes function sizes differently) so old_body_bytes exceeds the running total, the raw subtraction underflow-panics. Mirror the saturating arithmetic.

Placeholder mode used substring replace(), so a script could repeat the placeholder in a credential-owned header to segment/fingerprint the real secret across a header-reflecting endpoint. Substitute only when the placeholder occurs exactly once; otherwise fail closed, sending placeholders rather than the secret. (The single-echo risk on a fully reflecting approved host remains a documented v1 limitation.)

The Python builtin passed only its own PythonLimits.max_duration to Monty, ignoring the caller's ExecutionDeadline. Because the async execution timeout cannot preempt Monty's synchronous start/resume sections, a CPU-bound script could overrun a tighter bash timeout up to Monty's own (default 30s) limit. Clamp the VM duration to the remaining deadline before running, mirroring the TypeScript builtin.

read_word_or_fd_redirect collected all remaining input to inspect a 2-char redirect operator, and looks_like_assoc_assign scanned unbounded leading whitespace. Both made a crafted input (millions of digit-led words, or '$x=(' + megabytes of spaces) O(n) per call and the whole lex O(n^2), bounded only by the wall-clock parser timeout (TM-DOS-024). Cap the digit lookahead to 4 chars and reuse MAX_LOOKAHEAD for the whitespace scan, matching looks_like_brace_expansion.

sanitize() was applied only to top-level statement errors. The open_pure_memory / open_file_engine failures and dot-command errors (which carry raw turso messages that can include crate-relative paths) were interpolated directly into stderr. Route all four emission points through sanitize() (TM-SQL-011, idempotent on already-clean messages).

- TM-DOS-029/030/031: narrative + 'NEEDED' status said open; they are mitigated/fixed. Update the prose and the verification matrix to match the status table. - TM-DOS-042: the old 'bails out of recursion when budget is hit' claim was false for comma-lists (count charged on return); document the recursion-depth + output-byte caps that actually bound it. - Add TM-DOS-093 (jq unbounded generator) and reconcile the in-code threat ID to the next sequential number. - Document that the FS byte/count quotas do not apply to RealFs ReadWrite mounts (by design; --mount-rw is sandbox-breaking).

The new overflow test used {spec:?}, which the no_debug_fmt_in_builtin_source static guard (TM-INF-022) rejects in builtin sources. spec is a &str, so use Display.

The clamp test set a 300ms bash timeout and unwrapped the exec result, which could panic (and flake on loaded CI) if the outer execution timeout fired and returned Err(Timeout). Use a 2s timeout and accept either a non-zero Ok exit or a Timeout Err; the elapsed-time bound (well under the 60s Python limit) is the real regression guard.

Copilot AI review requested due to automatic review settings June 9, 2026 23:23

Copilot started reviewing on behalf of chaliy June 9, 2026 23:23 View session

Copilot AI reviewed Jun 9, 2026

View reviewed changes

Comment thread crates/bashkit/tests/integration/python_security_tests.rs Outdated

chaliy added 13 commits June 10, 2026 01:06

test(date): use Display in overflow-assert message

f85621d

The new overflow test used {spec:?}, which the no_debug_fmt_in_builtin_source static guard (TM-INF-022) rejects in builtin sources. spec is a &str, so use Display.

chaliy force-pushed the claude/vigilant-fermat-tztypx branch from 84e264a to e47cd39 Compare June 10, 2026 01:08

chaliy merged commit 61080c6 into main Jun 10, 2026
35 checks passed

chaliy deleted the claude/vigilant-fermat-tztypx branch June 10, 2026 01:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: harden DoS and panic surfaces found in deep security audit#2006

fix: harden DoS and panic surfaces found in deep security audit#2006
chaliy merged 13 commits into
mainfrom
claude/vigilant-fermat-tztypx

chaliy commented Jun 9, 2026 •

edited

Loading

Uh oh!

cloudflare-workers-and-pages Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

chaliy commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

How

Tests

Not included (need maintainer decision)

Uh oh!

cloudflare-workers-and-pages Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying with Cloudflare Workers

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

chaliy commented Jun 9, 2026 •

edited

Loading

cloudflare-workers-and-pages Bot commented Jun 9, 2026 •

edited

Loading