fix(python): keep private-loop worker off Python during interpreter exit#2008

Merged
chaliy merged 2 commits into
mainfrom
claude/determined-goldberg-1n00vy
Jun 10, 2026
Conversation

@chaliy chaliy commented Jun 10, 2026

What

Fixes the remaining CI failure on main after #2007: the python.yml Examples job crashed with SIGABRT (core dumped) in langgraph_async_tool.py at process exit. The crash is flaky, hitting roughly 25% of runs: it passed on the #2007 PR run and failed on the main push run.

Why

Core-dump analysis showed that the bashkit-py-loop private-loop worker thread woke from recv() when the callback engine was garbage-collected (which commonly happens inside Py_Finalize's GC pass) and then called Python::attach to close its asyncio event loop. Attaching a fresh thread state during interpreter finalization triggers a fatal CPython error:

Fatal Python error: PyGILState_Release: thread state ... must be current when releasing
Python runtime state: finalizing

Python::try_attach was tried first and does not help: its finalization check is compiled only for Python ≥ 3.13, and Py_IsInitialized() still returns 1 during Py_FinalizeEx's GC on older versions (confirmed with a second core dump showing the abort inside try_attach itself).

This race predates #2007 — it shipped with the private-loop worker redesign (#1918) — but was masked because every python.yml run on main since June 6 was cancelled by subsequent pushes.

How

The worker's exit path no longer touches Python at all: the loop's Py ref is dropped unattached (pyo3 safely defers the decref) and the loop is closed by asyncio's BaseEventLoop.__del__ when the deferred decref runs, or reclaimed by the OS at process exit. Documented as TM-PY-030 variant (3) in specs/threat-model.md.
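The reliance on asyncio's BaseEventLoop.__del__ can be checked from the Python side. The sketch below is an illustration only, not code from this PR: `RecordingLoop` and `drop_without_closing` are hypothetical names, and it assumes CPython, where an unclosed, non-running loop is closed by its finalizer once the last reference is gone. It does not reproduce the Rust worker or the deferred decref; it only shows that dropping the reference without calling close() still ends with the loop closed.

```python
import asyncio
import gc
import warnings


class RecordingLoop(asyncio.SelectorEventLoop):
    """Hypothetical subclass that counts calls to close()."""

    close_calls = 0

    def close(self):
        type(self).close_calls += 1
        super().close()


def drop_without_closing():
    """Create a loop, drop the only reference, and return how many
    times close() ran as a result."""
    before = RecordingLoop.close_calls
    loop = RecordingLoop()
    with warnings.catch_warnings():
        # BaseEventLoop.__del__ emits ResourceWarning for an unclosed loop;
        # silence it so the finalizer proceeds to close() under any filter.
        warnings.simplefilter("ignore", ResourceWarning)
        del loop
        # The loop participates in reference cycles (its self-pipe reader
        # callback refers back to it), so force a collection.
        gc.collect()
    return RecordingLoop.close_calls - before
```

Calling `drop_without_closing()` reports a single close() call made by the finalizer. If the interpreter exits before the finalizer runs, the PR's fallback applies instead: the OS reclaims the loop's resources at process exit.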

Tests

  • Before: langgraph_async_tool.py aborted 6/30 runs (and 2/5, 1/10 in other rounds). After: 0/80 across two 40-run stress rounds.
  • Full bashkit-python pytest suite: 700 passed, 1 skipped.
  • cargo fmt --check / cargo clippy -p bashkit-python --all-targets clean.
  • The Examples CI job itself is the ongoing regression signal for this exit-time race (it runs the crashing example on every PR).
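A stress round like the ones above can be sketched as a small harness. This is a hypothetical helper, not the script used for this PR: `count_aborts` and `clean_streak_probability` are illustrative names, and the SIGABRT check assumes a POSIX host, where a child killed by a signal reports a negative return code. The second function gives a rough sense of what 0/80 means, assuming runs are independent.

```python
import signal
import subprocess
import sys


def count_aborts(cmd, runs):
    """Run `cmd` `runs` times and count the runs killed by SIGABRT.

    Assumes POSIX: subprocess reports death by signal N as returncode -N.
    """
    aborts = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == -signal.SIGABRT:
            aborts += 1
    return aborts


def clean_streak_probability(failure_rate, runs):
    """Chance of `runs` consecutive clean runs if each run fails
    independently with probability `failure_rate`."""
    return (1.0 - failure_rate) ** runs
```

A round would look like `count_aborts([sys.executable, "langgraph_async_tool.py"], 40)`, with the path adjusted to wherever the example lives in the checkout. Pooling the three "before" rounds gives 9 aborts in 45 runs (20%); at that rate `clean_streak_probability(0.2, 80)` is about 1.8e-8, so 80 clean runs would be very unlikely if the race were still present.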

Generated by Claude Code

The python.yml Examples job crashed intermittently (SIGABRT, ~25% of
runs) in langgraph_async_tool.py at process exit: the bashkit-py-loop
worker thread woke when the engine was gc'd (commonly inside
Py_Finalize) and called Python::attach to close its asyncio loop.
Attaching a fresh thread state during finalization is fatal to CPython:
'PyGILState_Release: thread state must be current when releasing'.
Python::try_attach does not help: its finalization check is compiled
only for Python >= 3.13 and Py_IsInitialized() still returns 1 during
Py_FinalizeEx's GC on older versions (verified via core dumps).

The worker exit path no longer touches Python at all: the loop's Py ref
is dropped unattached (pyo3 defers the decref) and the loop is closed
by asyncio's BaseEventLoop.__del__ when the decref runs, or reclaimed
by the OS at process exit.

Documented as TM-PY-030 variant (3).

Verified: example aborted 6/30 runs before, 0/80 across two stress runs
after; full bashkit-python pytest suite passes (700 passed, 1 skipped).
Copilot AI review requested due to automatic review settings June 10, 2026 01:27
cloudflare-workers-and-pages Bot commented Jun 10, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | bashkit | 499a94c | Commit Preview URL, Branch Preview URL | Jun 10 2026, 01:32 AM |

Copilot AI left a comment

Pull request overview

Fixes a flaky interpreter-exit crash in the bashkit-py-loop private-loop worker by ensuring the worker thread’s shutdown path does not attach to Python during Py_Finalize GC/finalization, avoiding CPython fatal aborts at process exit.

Changes:

  • Remove Python interaction on the private-loop worker thread exit path (drop the loop Py ref unattached instead of calling into asyncio to close it).
  • Document the additional TM-PY-030 variant (interpreter-exit attach crash) in the threat model.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| specs/threat-model.md | Documents TM-PY-030 variant (3) describing the interpreter-exit crash scenario and mitigation. |
| crates/bashkit-python/src/lib.rs | Stops attaching to Python on private-loop worker shutdown; drops the event loop reference without touching Python. |

Comment thread specs/threat-model.md
Comment on lines 1767 to +1771
and dropped the last `Arc<Runtime>`; tokio's default `Runtime::drop` joins in-flight
blocking tasks, and an abandoned (timed-out) callback task must re-attach to finish —
freezing the entire interpreter. The `PyRuntime` handle now shuts the runtime down
with `shutdown_background()` on last drop. Regression tests:
with `shutdown_background()` on last drop. (3) The private-loop worker thread called
`Python::attach` on its exit path to close its asyncio loop; the worker usually wakes
Review feedback: the table row only described the two deadlock variants
while the paragraph below documents the interpreter-exit SIGABRT too.
@chaliy chaliy merged commit 7a03f7c into main Jun 10, 2026
25 checks passed
@chaliy chaliy deleted the claude/determined-goldberg-1n00vy branch June 10, 2026 02:45