feat: add session-start prompt-cache preload for crew kickoff (#5921) by devin-ai-integration[bot] · Pull Request #5922 · crewAIInc/crewAI

devin-ai-integration · 2026-05-25T06:46:09Z

Summary

Implements the session-start prompt-cache preload feature requested in #5921. Adds opt-in cache_preload and cache_preload_strategy parameters to the Crew class that fire lightweight 1-token cache-warming probes against each agent's system prompt at crew.kickoff() time.

This warms the provider's prompt cache (Anthropic prompt caching, OpenAI prefix caching, Gemini context caching) before the first real task runs, reducing first-step latency and cache-write costs for multi-agent crews.

Changes:

BaseLLM.preload_probe(system_prompt) — new method on the base LLM class that temporarily sets max_tokens=1 and temperature=0, then delegates to self.call(). This works with all LLM implementations (native providers + LiteLLM). Failures are logged as warnings and do not block execution.
Crew.cache_preload (bool, default False) — opt-in flag to enable cache warming at kickoff.
Crew.cache_preload_strategy (Literal, default "parallel") — three strategies:
- parallel — probes fired concurrently via a thread pool
- sequential — probes fired one-by-one in agent order
- shared_prefix — detects the common system-prompt prefix across agents; if ≥ 1024 chars, warms it once before per-agent suffixes; falls back to parallel otherwise
Crew._preload_caches() — internal method called during kickoff() after prepare_kickoff() completes (agents are fully initialized) but before the process runs. Only activates for crews with 2+ agents.
Crew._get_agent_system_prompt(agent) — helper that builds the exact system prompt for an agent using Prompts.task_execution().
Crew._common_prefix(strings) — utility to find the longest common character prefix.

Usage:

crew = Crew(
    agents=[a1, a2, a3],
    tasks=[t1, t2, t3],
    cache_preload=True,                    # opt-in
    cache_preload_strategy="parallel",     # or "sequential" / "shared_prefix"
)
crew.kickoff()

Review & Testing Checklist for Human

Verify the BaseLLM.preload_probe method correctly sends a 1-token completion via self.call() and does not raise on API errors (review lib/crewai/src/crewai/llms/base_llm.py)
Verify _preload_caches is called at the right point in kickoff() — after prepare_kickoff() but before process execution (review lib/crewai/src/crewai/crew.py)
Verify the shared_prefix strategy correctly falls back to parallel when the common prefix is < 1024 chars
Test with a real multi-agent crew (cache_preload=True) against Anthropic/OpenAI to confirm cache-warming reduces first-step latency
Confirm that cache_preload=False (default) does not change any existing behavior

Notes

Feature is fully opt-in — cache_preload=False by default, no behavioral changes for existing users
Single-agent crews skip preloading entirely (no-op)
preload_probe is defined on BaseLLM so it works with all LLM implementations (native OpenAI, Anthropic, Gemini, Azure, Bedrock, and LiteLLM fallback)
22 new tests added covering all strategies, field defaults, kickoff integration, and edge cases
All 126 existing crew tests and 56 LLM tests continue to pass

Link to Devin session: https://app.devin.ai/sessions/cd612f749eca4b80af1ebea64e832a8f

devin-ai-integration · 2026-05-25T06:46:11Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

Add opt-in cache_preload and cache_preload_strategy parameters to the Crew class that fire lightweight 1-token cache-warming probes against each agent's system prompt at kickoff time. This warms the provider's prompt cache (Anthropic, OpenAI prefix caching, etc.) before the first real task runs, reducing first-step latency and cache-write costs. Implementation: - BaseLLM.preload_probe(): sends max_tokens=1 completion with the agent's system prompt; failures are logged and never propagated - Crew.cache_preload / Crew.cache_preload_strategy fields - Crew._preload_caches() with three strategies: * parallel: concurrent probes via ThreadPoolExecutor * sequential: one-by-one in agent order * shared_prefix: warm common prefix once then per-agent suffixes; falls back to parallel when prefix < 1024 chars The feature is opt-in (cache_preload=False by default) and only activates for crews with 2+ agents. Co-Authored-By: João <joao@crewai.com>

Co-Authored-By: João <joao@crewai.com>

+
+from unittest.mock import MagicMock, patch
+
+import pytest


+             patch.object(crew, "_run_sequential_process", return_value=MagicMock()):
+            try:
+                crew.kickoff()
+            except Exception:


+             patch.object(crew, "_run_sequential_process", return_value=MagicMock()):
+            try:
+                crew.kickoff()
+            except Exception:


+             patch.object(crew, "_run_sequential_process", return_value=MagicMock()):
+            try:
+                crew.kickoff()
+            except Exception:


- Use explicit type annotation for original_max_tokens in preload_probe - Use self.__setattr__ to avoid type mismatch with subclass fields - Replace hasattr checks with isinstance(agent.llm, BaseLLM) for proper type narrowing - Ensure _get_agent_system_prompt returns str without Any leak Co-Authored-By: João <joao@crewai.com>

github-code-quality Bot found potential problems May 25, 2026

View reviewed changes

Comment thread tests/test_cache_preload.py Fixed

Comment thread lib/crewai/tests/test_cache_preload.py Fixed

Comment thread tests/test_cache_preload.py Fixed

devin-ai-integration Bot force-pushed the devin/1779691235-cache-preload-kickoff branch from 018b902 to 158d962 Compare May 25, 2026 07:01

github-actions Bot added the size/XL label May 25, 2026

style: apply ruff format fixes

2b60f3d

Co-Authored-By: João <joao@crewai.com>

github-code-quality Bot found potential problems May 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add session-start prompt-cache preload for crew kickoff (#5921)#5922

feat: add session-start prompt-cache preload for crew kickoff (#5921)#5922
devin-ai-integration[bot] wants to merge 3 commits into
mainfrom
devin/1779691235-cache-preload-kickoff

devin-ai-integration Bot commented May 25, 2026 •

edited

Loading

Uh oh!

devin-ai-integration Bot commented May 25, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

Conversation

devin-ai-integration Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Review & Testing Checklist for Human

Notes

Uh oh!

devin-ai-integration Bot commented May 25, 2026

🤖 Devin AI Engineer

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

devin-ai-integration Bot commented May 25, 2026 •

edited

Loading