cuda.core: harden graph user-object destructor during Python shutdown#2074
cuda.core: harden graph user-object destructor during Python shutdown#2074aryanputta wants to merge 8 commits into
Conversation
Signed-off-by: Aryan <aryansputta@gmail.com>
Signed-off-by: Aryan <aryansputta@gmail.com>
There was a problem hiding this comment.
This looks good to me.
It’d be nice to add a comment like this in cuda_core/cuda/core/_cpp/resource_handles.hpp next to the existing py_is_finalizing() helper. It documents an important subtlety from CPython shutdown semantics: the check is only best-effort, and there is still a narrow race where non-finalizer threads can be killed on older CPython or hang on newer CPython. That means teardown in this layer can still be cut short in rare cases, so the guard reduces the risk but does not fully eliminate it.
// Best-effort probe for interpreter shutdown.
//
// In CPython this is not a hard guarantee: finalization can begin after this
// returns false but before a later PyGILState_Ensure() / other Python C-API
// call.
//
// If that race is lost on a non-finalizer thread, CPython's behavior is
// version-dependent: on older supported versions (3.10-3.13) it may abruptly the
// current thread (historically via PyThread_exit_thread(), i.e. without normal
// C++ unwinding), while on newer versions (3.14+) it may hang the thread until
// process exit.
//
// We still use this check because the policy in this layer is to avoid Python
// work once shutdown is underway and accept an intentional leak / skipped
// Python conversion in that edge case rather than add more complex deferral
// machinery.
inline bool py_is_finalizing() noexcept {
#if PY_VERSION_HEX >= 0x030D0000
return Py_IsFinalizing();
#else
return _Py_IsFinalizing() != 0;
#endif
}
Signed-off-by: Aryan <aryansputta@gmail.com>
…o fix-graph-destructor-shutdown
|
/ok to test 327b53e |
|
|
@rwgk, I’ve updated the resource handles to include the suggested comments on shutdown semantics. I believe all feedback has been addressed; please let me know if any additional modifications are needed before merging. |
|
This looks ready to merge to me, but @Andy-Jost is the original author of the changed code, I'm waiting to give him a chance to approve, too. |
|
@aryanputta at this stage it's best if you don't merge main anymore, unless the UI here reports conflicts. Each time you merge main, we have to rerun the CI again. |
Okay, sorry! |
Summary
Closes #2042.
_py_host_destructorunconditionally enteredwith gilbefore callingPy_DECREF. That is fine during normal runtime, but it is unsafe in thegraph user-object path because CUDA may invoke the destructor
asynchronously after interpreter finalization has begun.
This change makes
_py_host_destructornogil, checksPy_IsInitialized(), and only enters a GIL section when Python is stillinitialized.
Changes
cuda_core/cuda/core/graph/_utils.pyx: declarePy_IsInitialized()andharden
_py_host_destructorso it only acquires the GIL when Python isstill initialized.
cuda_core/cuda/core/graph/_utils.pxd: update the destructor signatureto match.
The normal runtime path is unchanged: if Python is still alive, the
destructor still decrefs the attached object. The change only affects the
shutdown edge where acquiring the GIL is no longer safe.
Validation
Locally verified:
git diff --checkpasses._utils.pyx/_utils.pxd.Not fully verified on this machine:
cuda.coretest suitecuda_coreRelated Work
user-object lifetime handling was extended to kernel arguments.