
Add guidance on when to use ResumableJobMixin vs deferrable and async operators #67794

Closed
jangByeongHui wants to merge 7 commits into
apache:mainfrom
jangByeongHui:docs/67706-resumable-vs-deferrable

@jangByeongHui
Contributor

@jangByeongHui jangByeongHui commented May 31, 2026

Summary

Airflow 3.x now ships three mechanisms for handling long-running or blocking
work inside a task — deferrable operators, async Python tasks, and
ResumableJobMixin — but no documentation explains the trade-offs or when to
reach for each one.

This PR closes that gap by extending the existing Deferred vs Async Operators
page in the Task SDK docs with a full three-way comparison, and adding a short
cross-reference in the core deferring guide.

closes: #67706


Changes

task-sdk/docs/deferred-vs-async-operators.rst — primary

  • Page title updated from Deferred vs Async Operators to
    Deferred, Async, and Resumable Operators.
    The RST label sdk-deferred-vs-async-operators is preserved unchanged so
    existing external links continue to work.

  • Intro paragraph updated to mention all three patterns.
    A .. versionchanged:: 3.3.0 note is added, pointing readers to the new
    section.

  • New "Resumable Operators" section added before "When not to use…":

    • Explains the core mechanic: the mixin persists the external job identifier
      to task_state before polling; on retry it reconnects to the already-running
      job instead of submitting a duplicate.
    • Lists key characteristics: worker slot is held (not freed), no Triggerer
      required, duplicate submission prevented automatically.
    • Provides when-to-use and when-to-avoid guidance.
    • Includes a full code example showing how to subclass ResumableJobMixin,
      with a note that SparkSubmitOperator uses this pattern in practice.
  • New "Three-way Comparison" table (.. list-table::) comparing deferrable /
    async / resumable across six dimensions:

    Dimension                Deferrable              Async @task              Resumable
    -----------------------  ----------------------  -----------------------  -----------------------
    Worker slot during wait  Freed                   Held                     Held
    Requires Triggerer       Yes                     No                       No
    State passed on retry    Via method_name/kwargs  Not persisted            Auto via task_state
    Duplicate prevention     Manual                  Manual                   Automatic
    Ideal workload           Single external event   Many concurrent I/O ops  Long-running remote job
    Available from           Airflow 2.2             Airflow 3.2              Airflow 3.3
  • New bullet appended to the existing "When not to use" section advising
    against resumable operators when a Triggerer is available and the operator is
    being written from scratch.
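
The reconnect-on-retry mechanic summarised for the new "Resumable Operators" section can be sketched without Airflow at all. The snippet below is an illustration under stated assumptions, not the ResumableJobMixin API (which this description does not show): `FakeJobService`, `run_resumable`, and the plain dict standing in for task_state are all hypothetical names.

```python
import itertools


class FakeJobService:
    """Hypothetical external system that runs jobs addressed by a stable ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.submitted = []  # every job ID ever submitted
        self._polls_left = {}

    def submit(self):
        job_id = f"job-{next(self._ids)}"
        self.submitted.append(job_id)
        self._polls_left[job_id] = 3  # this fake job needs three polls to finish
        return job_id

    def is_done(self, job_id):
        self._polls_left[job_id] -= 1
        return self._polls_left[job_id] <= 0


def run_resumable(service, state, max_polls):
    """Submit once, persist the job ID, then poll; reuse the ID on retry.

    `state` is a plain dict standing in for a persisted task state store.
    """
    job_id = state.get("job_id")
    if job_id is None:
        job_id = service.submit()
        state["job_id"] = job_id  # persist BEFORE polling starts
    for _ in range(max_polls):
        if service.is_done(job_id):
            return job_id
    # Simulates an attempt that fails while the remote job is still running.
    raise TimeoutError(f"{job_id} still running")


service = FakeJobService()
state = {}
try:
    run_resumable(service, state, max_polls=1)  # first attempt fails mid-poll
except TimeoutError:
    pass
finished = run_resumable(service, state, max_polls=5)  # retry reconnects
```

Because the identifier is written to the state dict before the first poll, the second call finds it and skips `submit()`, so only one job is ever created.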

airflow-core/docs/authoring-and-scheduling/deferring.rst — secondary

  • Updated the cross-reference note near the top (line 35) to mention resumable
    operators alongside deferred and async.
  • Appended a new .. _deferring/resumable: subsection after the existing
    mode='reschedule' vs deferrable=True comparison table, summarising the
    resumable pattern and linking to the Task SDK page for the full comparison.

task-sdk/docs/index.rst — tertiary

  • Updated the "Choosing Between" section heading and paragraph to describe all
    three patterns instead of two.

Testing

This PR changes documentation only. No Python source files were modified.

Pre-commit hooks (prek run --from-ref main --stage pre-commit) were run
locally and passed — including RST lint, codespell, and newsfragment validation.


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (claude-sonnet-4-6)

Generated-by: Claude Code (claude-sonnet-4-6) following the guidelines

@jangByeongHui jangByeongHui force-pushed the docs/67706-resumable-vs-deferrable branch from 1a9d264 to 20494dd Compare May 31, 2026 08:22
@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Jun 1, 2026
@jangByeongHui
Contributor Author

Hi @amoghrajesh — I know you're busy, but would you mind taking a look at this PR when you get a chance? It adds documentation comparing the three patterns for handling long-running tasks in Airflow 3.x (deferrable operators, async @task, and ResumableJobMixin), and any feedback from you would be really appreciated.
Also tagging @ashb and @kaxil for visibility — no rush at all, thank you!

@ashb
Member

ashb commented Jun 4, 2026

ResumableJobMixin only exists in main, and will be released with 3.3.0


.. _deferring/resumable:

Resumable Operators (ResumableJobMixin)
Member

Resumable operators are usable without this mixin -- for instance, if you are processing a list of files in S3 and want to carry on where you left off, using the job mixin would not be appropriate, but using a resumable operator could be.
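
A rough sketch of that second kind of resumability, offered as an illustration only: the plain dict stands in for the task state store, and `process_files`, `handle`, and the file names are hypothetical, not Airflow or S3 APIs.

```python
def process_files(keys, state, handle):
    """Process each key once, recording progress after every file.

    `state` is a plain dict standing in for a persisted task state store.
    """
    done = set(state.get("processed", []))
    for key in keys:
        if key in done:
            continue  # already handled on an earlier attempt
        handle(key)
        done.add(key)
        state["processed"] = sorted(done)  # checkpoint after each file


keys = ["a.csv", "b.csv", "c.csv"]
state = {}
calls = []
fail_once = {"b.csv"}  # keys that raise on their first attempt only


def handle(key):
    if key in fail_once:
        fail_once.discard(key)
        raise RuntimeError(f"transient error on {key}")
    calls.append(key)


try:
    process_files(keys, state, handle)  # first attempt stops at b.csv
except RuntimeError:
    pass
process_files(keys, state, handle)  # retry skips a.csv and carries on
```

There is no external job identifier to reconnect to here; the retry simply reads the recorded progress and decides what is left to do.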

execution. Use it when:

- A Triggerer is not available, or the operator is already synchronous.
- The external system supports reconnecting to a running job via a stable identifier.
Member

This is specific to the mixin, not an intrinsic requirement of when to use State/Resumable operators.

This is not incorrect, but it's not complete.

Member

Remove this. Not relevant to call out like that.

Resumable Operators
-------------------

A *resumable operator* uses :class:`~airflow.sdk.ResumableJobMixin` to make a
Member

Again, this isn't true. A resumable operator is one that reads from the task state store and makes decisions based on the data included in there.

@kaxil
Member

kaxil commented Jun 4, 2026

Closing this, @amoghrajesh you'd be the best person to work on this doc. Others won't have that context.

Thanks for the PR, @jangByeongHui, but this needs more context from the AIP implementer!

@kaxil kaxil closed this Jun 4, 2026

Successfully merging this pull request may close these issues.

Add guidance on when to use ResumableMixin vs deferrable operators

4 participants