Skip to content

fix(crewai-files): avoid pickle serialization in upload cache#5950

Open
White-Mouse wants to merge 2 commits into
crewAIInc:mainfrom
White-Mouse:codex/crewai-files-upload-cache-json
Open

fix(crewai-files): avoid pickle serialization in upload cache#5950
White-Mouse wants to merge 2 commits into
crewAIInc:mainfrom
White-Mouse:codex/crewai-files-upload-cache-json

Conversation

@White-Mouse
Copy link
Copy Markdown

@White-Mouse White-Mouse commented May 27, 2026

What

crewai-files UploadCache supports external cache backends (e.g. redis) for distributed setups. Today it uses aiocache.serializers.PickleSerializer for cached upload metadata.

This PR replaces that with a small JSON serializer that round-trips the CachedUpload dataclass, including datetime fields, without using pickle.

Why

Pickle is unsafe for untrusted inputs. When caches are shared or external, a poisoned cache value should not be able to turn into code execution via deserialization.

Compatibility

Unreadable cache entries are now treated as cache misses. This includes old pickle bytes, malformed JSON, non-object JSON payloads, and JSON objects missing required CachedUpload fields. Valid JSON cache entries still round-trip normally.

Verification

  • uv run ruff check lib/crewai-files/src/crewai_files/cache/upload_cache.py lib/crewai-files/tests/test_upload_cache.py
  • uv run ruff format --check lib/crewai-files/src/crewai_files/cache/upload_cache.py lib/crewai-files/tests/test_upload_cache.py
  • uv run pytest -q lib/crewai-files/tests/test_upload_cache.py

Summary by CodeRabbit

  • Refactor

    • Upload cache now uses a JSON-based serializer for cached uploads, improving readability and interoperability.
    • Legacy or unreadable cache payloads are now safely treated as cache misses.
  • Tests

    • Added tests verifying JSON round-trip serialization and that malformed or legacy payloads are handled as cache misses.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: aefcba8b-a0da-416d-9998-c48623536f20

📥 Commits

Reviewing files that changed from the base of the PR and between 2c691ac and 3e9caf9.

📒 Files selected for processing (2)
  • lib/crewai-files/src/crewai_files/cache/upload_cache.py
  • lib/crewai-files/tests/test_upload_cache.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/crewai-files/src/crewai_files/cache/upload_cache.py

📝 Walkthrough

Walkthrough

Swap pickle serialization for a custom JSON serializer in the upload cache. Implement _CachedUploadSerializer that serializes CachedUpload (ISO datetimes and optional expires_at), register it for both Redis and in-memory aiocache backends, and add tests for round-trip and invalid payload handling.

Changes

Upload Cache Serialization

Layer / File(s) Summary
Serializer Foundation - Import and Implementation
lib/crewai-files/src/crewai_files/cache/upload_cache.py
Module imports updated to include json, logging, and BaseSerializer; added _CachedUploadSerializer implementing dumps/loads to convert CachedUpload to/from JSON with uploaded_at and optional expires_at formatted via isoformat()/fromisoformat(), and defensive handling that returns None for unreadable or invalid payloads.
Cache Backend Integration
lib/crewai-files/src/crewai_files/cache/upload_cache.py
UploadCache initialization for both Redis and in-memory aiocache backends now configures serializer=_CachedUploadSerializer() instead of the previous PickleSerializer().
Serializer Tests
lib/crewai-files/tests/test_upload_cache.py
New TestCachedUploadSerializer tests: test_json_round_trip asserts JSON serialize→deserialize produces an equivalent CachedUpload; test_unreadable_payloads_are_cache_misses asserts that legacy pickle payloads and malformed/non-dict JSON return None from loads.

🎯 4 (Complex) | ⏱️ ~45 minutes

🐰 A nibble of code, a careful convert,
JSON hopped in where pickle once spurt.
Dates in ISO, tidy and bright,
Old blobs return None, new ones take flight.
Hooray — the cache is sleeping well tonight. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and accurately summarizes the main change: replacing pickle serialization with JSON serialization in the upload cache, which is the core objective of this PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai-files/src/crewai_files/cache/upload_cache.py`:
- Around line 68-76: The loads method currently raises on malformed or non-JSON
payloads which turns unreadable cache entries into failures; modify loads (in
class/method loads) to treat unreadable entries as cache misses by wrapping the
json.loads call, the type check, and the call to _from_json in a try/except and
return None on any decode/validation errors (e.g., json.JSONDecodeError,
TypeError, ValueError, KeyError, or any exception from _from_json) instead of
raising; keep the None checks for empty value but ensure any malformed or
unexpected payload simply results in returning None.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e826dc5f-c847-47a6-9d7e-bd7af7712c71

📥 Commits

Reviewing files that changed from the base of the PR and between c5ea415 and 2c691ac.

📒 Files selected for processing (1)
  • lib/crewai-files/src/crewai_files/cache/upload_cache.py

Comment thread lib/crewai-files/src/crewai_files/cache/upload_cache.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants