fix: harden DoS and panic surfaces found in deep security audit#2006
Conversation
Deploying with
|
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs |
bashkit | e47cd39 | Commit Preview URL Branch Preview URL |
Jun 10 2026, 01:08 AM |
There was a problem hiding this comment.
Pull request overview
This PR hardens bashkit’s sandboxed interpreter and core builtins against security-audit findings that were default-reachable DoS vectors (process-fatal stack overflow / unbounded synchronous generation), plus several contained panic surfaces and limit-propagation gaps. It adds defensive caps (depth/bytes/deadline polling), corrects shell semantics around EXIT traps for eval/alias expansion, and adds regression tests documenting the threat-model fixes.
Changes:
- Hardened brace expansion and jq execution against unbounded recursion/output and synchronous-iterator hangs (new depth/byte caps + deadline polling).
- Fixed semantic and safety issues (EXIT trap behavior for
eval/aliases, checked arithmetic inexpr/date, safer credential placeholder substitution, saturating memory accounting, bounded lexer lookahead). - Added/updated threat-model documentation and regression tests for each finding.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| specs/threat-model.md | Updates threat-model status text and documents RealFs RW quota scope; adds TM-DOS-093 entry and clarifies TM-DOS-042 mitigation. |
| crates/bashkit/tests/integration/threat_model_tests.rs | Adds integration regression for brace expansion comma-list recursion bomb (TM-DOS-042). |
| crates/bashkit/tests/integration/python_security_tests.rs | Adds regression ensuring bash timeout bounds in-process Python duration. |
| crates/bashkit/tests/integration/blackbox_security_tests.rs | Adds regressions for eval EXIT-trap timing and EXIT-trap recursion termination. |
| crates/bashkit/src/parser/lexer.rs | Bounds lexer lookahead to avoid O(n²) scans (TM-DOS-024). |
| crates/bashkit/src/limits.rs | Uses saturating subtraction in function-body byte accounting to avoid underflow panics. |
| crates/bashkit/src/interpreter/mod.rs | Routes alias/eval through execute_script_body (no early EXIT trap) and hardens brace expansion with depth + output-byte caps (TM-DOS-042). |
| crates/bashkit/src/credential.rs | Refuses placeholder substitution when the placeholder repeats (fail-closed) + adds test. |
| crates/bashkit/src/builtins/sqlite/mod.rs | Sanitizes SQLite open/runtime error messages before returning to user. |
| crates/bashkit/src/builtins/python.rs | Clamps Monty runtime duration to remaining bash execution deadline. |
| crates/bashkit/src/builtins/mod.rs | Adds ExecutionDeadline::is_expired() for synchronous-loop builtins. |
| crates/bashkit/src/builtins/jq/tests.rs | Adds regression for unbounded generator output cap (TM-DOS-093). |
| crates/bashkit/src/builtins/jq/mod.rs | Enforces stdout-byte cap and periodically polls execution deadline during synchronous jaq iteration (TM-DOS-093). |
| crates/bashkit/src/builtins/expr.rs | Uses checked div/rem to avoid i64::MIN / -1 and % -1 panic; adds tests. |
| crates/bashkit/src/builtins/date.rs | Uses checked chrono arithmetic to avoid overflow panics on extreme relative offsets; adds tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The combinatorial count cap was only charged after each recursive call
returned, so the first DFS path descended one frame per brace group with
the count still zero. An input like {a,b}{a,b}... repeated tens of
thousands of times (far under max_input_bytes) stack-overflowed the
worker thread, and could accumulate O(depth*suffix) memory, before any
cap or the execution timeout could fire.
Add a recursion-depth cap and a cumulative output-byte budget
(MAX_EXPANSION_RESULT_BYTES) so a runaway expansion degrades to a bounded
literal result instead of crashing the host.
jaq evaluation is a synchronous iterator that the async execution timeout cannot preempt, and the result loop appended every produced value to an in-memory String with no byte cap, value-count cap, or deadline check. 'jq -n repeat(1)' (or range(0;1e18)) therefore grew output without limit (OOM) or spun forever producing no output (hang), reachable by default since jq is a core builtin. Cap accumulated output against the caller's max_stdout_bytes and poll a new ExecutionDeadline::is_expired() periodically so a runaway filter aborts with a clean error instead of wedging the host.
expr's / and % used raw a / b while + - * already used checked_*, so 'expr -9223372036854775808 / -1' (and % -1) overflowed: a debug-build panic, a wrong wrapping result in release. Use checked_div/checked_rem and return the existing 'integer overflow' error.
unit_duration multiplied n * 30 / n * 365 and built chrono Durations unchecked, then the caller added the offset to a base datetime unchecked. Inputs like 'date -d "30000000000000000 years"' overflowed the multiply or the chrono range and panicked. Use checked_mul + Duration::try_* and checked_add_signed, returning a 'date out of range' error instead.
eval and alias expansion called Interpreter::execute(), which runs the EXIT trap, instead of execute_script_body() like source does. This is wrong bash semantics — the EXIT trap fired at the end of every eval'd or aliased command rather than at shell exit — and because the top-level EXIT-trap block has no re-entrancy guard, 'trap "eval :" EXIT' recursed one command per level until the budget aborted, risking a stack overflow first. Route both through execute_script_body(.., run_exit_trap=false).
The byte-accounting check used self.function_body_bytes + body_bytes - old_body_bytes while the sibling record_function_insert already used saturating_sub. If accounting drifts (e.g. after a snapshot restore recomputes function sizes differently) so old_body_bytes exceeds the running total, the raw subtraction underflow-panics. Mirror the saturating arithmetic.
Placeholder mode used substring replace(), so a script could repeat the placeholder in a credential-owned header to segment/fingerprint the real secret across a header-reflecting endpoint. Substitute only when the placeholder occurs exactly once; otherwise fail closed, sending placeholders rather than the secret. (The single-echo risk on a fully reflecting approved host remains a documented v1 limitation.)
The Python builtin passed only its own PythonLimits.max_duration to Monty, ignoring the caller's ExecutionDeadline. Because the async execution timeout cannot preempt Monty's synchronous start/resume sections, a CPU-bound script could overrun a tighter bash timeout up to Monty's own (default 30s) limit. Clamp the VM duration to the remaining deadline before running, mirroring the TypeScript builtin.
read_word_or_fd_redirect collected all remaining input to inspect a
2-char redirect operator, and looks_like_assoc_assign scanned unbounded
leading whitespace. Both made a crafted input (millions of digit-led
words, or '$x=(' + megabytes of spaces) O(n) per call and the whole lex
O(n^2), bounded only by the wall-clock parser timeout (TM-DOS-024). Cap
the digit lookahead to 4 chars and reuse MAX_LOOKAHEAD for the
whitespace scan, matching looks_like_brace_expansion.
sanitize() was applied only to top-level statement errors. The open_pure_memory / open_file_engine failures and dot-command errors (which carry raw turso messages that can include crate-relative paths) were interpolated directly into stderr. Route all four emission points through sanitize() (TM-SQL-011, idempotent on already-clean messages).
- TM-DOS-029/030/031: narrative + 'NEEDED' status said open; they are mitigated/fixed. Update the prose and the verification matrix to match the status table. - TM-DOS-042: the old 'bails out of recursion when budget is hit' claim was false for comma-lists (count charged on return); document the recursion-depth + output-byte caps that actually bound it. - Add TM-DOS-093 (jq unbounded generator) and reconcile the in-code threat ID to the next sequential number. - Document that the FS byte/count quotas do not apply to RealFs ReadWrite mounts (by design; --mount-rw is sandbox-breaking).
The new overflow test used {spec:?}, which the no_debug_fmt_in_builtin_source
static guard (TM-INF-022) rejects in builtin sources. spec is a &str, so
use Display.
The clamp test set a 300ms bash timeout and unwrapped the exec result, which could panic (and flake on loaded CI) if the outer execution timeout fired and returned Err(Timeout). Use a 2s timeout and accept either a non-zero Ok exit or a Timeout Err; the elapsed-time bound (well under the 60s Python limit) is the real regression guard.
84e264a to
e47cd39
Compare
What
A deep security/bug audit of the sandboxed interpreter surfaced two default-reachable, process-fatal DoS bugs plus a set of contained panics and hardening gaps. This PR fixes all of them. No sandbox escape (real FS / network / process / host-env) was found on any surface; the exposure was DoS and a few defensive gaps.
Why
bashkit runs untrusted/LLM-generated scripts, so the script author is the attacker. A single line of script could crash the host:
echo {a,b}{a,b}…(~250 KB, under the input cap) stack-overflowed the worker thread.jq -n 'repeat(1)'grew output unboundedly / hung (jaq's synchronous iterator can't be preempted by the async timeout).How
HIGH — default-reachable, process-fatal DoS
interpreter/mod.rs): the combinatorial count cap was charged only on recursion return, so a comma-list descended one frame per group before any cap engaged. Add a recursion-depth cap and a cumulative output-byte budget (MAX_EXPANSION_RESULT_BYTES) so runaway expansion degrades to a bounded literal. (TM-DOS-042)builtins/jq/mod.rs): cap accumulated output at the caller'smax_stdout_bytesand poll a newExecutionDeadline::is_expired()every 4096 values. (TM-DOS-093)Medium / contained panics / semantics
evaland alias expansion ran bodies viaexecute(), which fires the EXIT trap (wrong bash semantics) and — with no re-entrancy guard — lettrap 'eval :' EXITrecurse. Route both throughexecute_script_body(run_exit_trap=false)likesource. (TM-DOS-030)expri64::MIN / -1(and% -1) anddate -dmonth/year multiply + chrono add → checked arithmetic.sanitize()(TM-SQL-011).MemoryBudget::check_function_insertuses saturating subtraction.Spec
RealFsReadWrite mounts (by design;--mount-rwis sandbox-breaking).Tests
Every fix has a failing-test-first regression test (brace recursion bomb, jq unbounded generator, eval EXIT-trap timing + recursion termination, expr/date overflow, credential repeat fail-closed, python deadline clamp via the public API, sqlite error paths, lexer/limits).
Verified green: lib (2445), integration with
http_client,jq,sqlite(1054),python(280),realfs(36),failpoints(14),typescript(116 + doctests);cargo fmt --checkandclippy --all-targets -- -D warningsclean. End-to-end smoke via the CLI confirms the brace bomb produces bounded output andjq -n 'repeat(1)'returns a cleanoutput limit exceedederror (rc=5), neither crashing nor hanging.Not included (need maintainer decision)
after_http), a spec-level feature.--mount-rwunmetered host access — by design; only documented here.