
cuda.core: harden graph user-object destructor during Python shutdown#2074

Open
aryanputta wants to merge 8 commits into NVIDIA:main from aryanputta:fix-graph-destructor-shutdown

Conversation


@aryanputta aryanputta commented May 12, 2026

Summary

Closes #2042.

_py_host_destructor unconditionally entered a with gil block before calling
Py_DECREF. That is fine during normal runtime, but it is unsafe in the
graph user-object path because CUDA may invoke the destructor
asynchronously after interpreter finalization has begun.

This change makes _py_host_destructor nogil, checks
Py_IsInitialized(), and only enters a GIL section when Python is still
initialized.

Changes

  • cuda_core/cuda/core/graph/_utils.pyx: declare Py_IsInitialized() and
    harden _py_host_destructor so it only acquires the GIL when Python is
    still initialized.
  • cuda_core/cuda/core/graph/_utils.pxd: update the destructor signature
    to match.

The normal runtime path is unchanged: if Python is still alive, the
destructor still decrefs the attached object. The change only affects the
shutdown edge where acquiring the GIL is no longer safe.

Validation

Locally verified:

  • git diff --check passes.
  • The change is limited to the existing graph user-object destructor path in
    _utils.pyx / _utils.pxd.
  • The commit is DCO-signed and SSH-signed.

Not fully verified on this machine:

  • full cuda.core test suite
  • editable build of cuda_core

Related Work


copy-pr-bot Bot commented May 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions Bot added the cuda.core Everything related to the cuda.core module label May 12, 2026
@mdboom mdboom requested a review from Andy-Jost May 13, 2026 12:48

@rwgk rwgk left a comment


This looks good to me.

It’d be nice to add a comment like this in cuda_core/cuda/core/_cpp/resource_handles.hpp, next to the existing py_is_finalizing() helper. It documents an important subtlety of CPython shutdown semantics: the check is only best-effort, and there is still a narrow race in which non-finalizer threads can be killed on older CPython versions or hang on newer ones. Teardown in this layer can therefore still be cut short in rare cases; the guard reduces the risk but does not fully eliminate it.

  // Best-effort probe for interpreter shutdown.
  //
  // In CPython this is not a hard guarantee: finalization can begin after this
  // returns false but before a later PyGILState_Ensure() / other Python C-API
  // call.
  //
  // If that race is lost on a non-finalizer thread, CPython's behavior is
  // version-dependent: on older supported versions (3.10-3.13) it may abruptly
  // terminate the current thread (historically via PyThread_exit_thread(), i.e.
  // without normal C++ unwinding), while on newer versions (3.14+) it may hang
  // the thread until process exit.
  //
  // We still use this check because the policy in this layer is to avoid Python
  // work once shutdown is underway and accept an intentional leak / skipped
  // Python conversion in that edge case rather than add more complex deferral
  // machinery.
  inline bool py_is_finalizing() noexcept {
  #if PY_VERSION_HEX >= 0x030D0000
      return Py_IsFinalizing();
  #else
      return _Py_IsFinalizing() != 0;
  #endif
  }

Signed-off-by: Aryan <aryansputta@gmail.com>

@rwgk rwgk left a comment


Thanks!

@rwgk rwgk added the P0 High priority - Must do! label May 14, 2026
@rwgk rwgk added this to the cuda.core next milestone May 14, 2026
@rwgk rwgk added the bug Something isn't working label May 14, 2026

rwgk commented May 14, 2026

/ok to test 327b53e


@aryanputta
Contributor Author

@rwgk, I’ve updated the resource handles to include the suggested comments on shutdown semantics. I believe all feedback has been addressed; please let me know if any additional modifications are needed before merging.


rwgk commented May 14, 2026

This looks ready to merge to me, but @Andy-Jost is the original author of the changed code, I'm waiting to give him a chance to approve, too.


rwgk commented May 14, 2026

@aryanputta at this stage it's best if you don't merge main anymore, unless the UI here reports conflicts.

Each time you merge main, we have to rerun the CI again.

@aryanputta
Contributor Author

> @aryanputta at this stage it's best if you don't merge main anymore, unless the UI here reports conflicts.
>
> Each time you merge main, we have to rerun the CI again.

Okay, sorry!



Development

Successfully merging this pull request may close these issues.

Harden _py_host_destructor against invocation after Py_Finalize
