[BUG][cudf_polars] distributed cluster creates a new Context for every query run #21989

@nirandaperera

Describe the bug

https://github.com/rapidsai/cudf/blob/main/python/cudf_polars/cudf_polars/experimental/rapidsmpf/dask.py#L173-L183

In Dask cluster mode, a new Context is created for every query run. Each run also creates a new Options object, which is a different object from the Options used to create the BufferResource.

Expected behavior
Since we don't use tasks anymore, we should create a rapidsmpf Context once, stash it in the Dask worker context (similar to the BufferResource), and reuse it for every query.
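A rough sketch of the stash-and-reuse pattern, for illustration only. It does not use the real rapidsmpf or Dask APIs: `Context`, `setup_worker`, `get_context`, and the `_rmpf_*` attribute names are all hypothetical stand-ins, and the worker is a plain namespace object rather than a `distributed.Worker`. The idea is that the Options object is stored once at worker setup, and the Context is built lazily from that same object and cached on the worker.

```python
from types import SimpleNamespace


class Context:
    """Hypothetical stand-in for a rapidsmpf Context (not the real class)."""

    created = 0  # counts constructions so reuse can be checked

    def __init__(self, options, buffer_resource):
        Context.created += 1
        self.options = options
        self.buffer_resource = buffer_resource


def setup_worker(worker, options):
    """Run once per worker: stash the Options and the BufferResource stand-in."""
    worker._rmpf_options = options
    worker._rmpf_buffer_resource = object()  # placeholder for a BufferResource
    worker._rmpf_context = None


def get_context(worker):
    """Return the worker's cached Context, creating it on first use."""
    if worker._rmpf_context is None:
        # Built from the same Options object that was stashed at setup,
        # so the Context and the BufferResource agree on configuration.
        worker._rmpf_context = Context(
            worker._rmpf_options, worker._rmpf_buffer_resource
        )
    return worker._rmpf_context


# Stand-in for a Dask worker; a real one would come from the cluster.
worker = SimpleNamespace()
options = {"example_option": "value"}  # hypothetical option contents
setup_worker(worker, options)

# Two "query runs" on the same worker share one Context.
ctx_first = get_context(worker)
ctx_second = get_context(worker)
```

With this shape, per-query code would call `get_context(worker)` instead of constructing a Context, so every query on a given worker sees the same Context and the same Options object.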

Labels

bug (Something isn't working), cudf-polars (Issues specific to cudf-polars)
