fix(security): sanitize telemetry data + add SHA-256 verification for model downloads by MAXDVVV · Pull Request #1723 · openinterpreter/open-interpreter

MAXDVVV · 2026-04-07T19:00:46Z

Summary

This PR addresses two security concerns in Open Interpreter:

1. Telemetry Data Sanitization (`telemetry.py`)

Problem: The send_telemetry() function sends properties to PostHog without any sanitization. When exception stack traces or other debug data are included in telemetry properties, they may inadvertently leak:

- Absolute file paths revealing usernames and directory structures
- Environment variable values
- API keys, tokens, or passwords embedded in error messages

Fix: Added a _sanitize_properties() layer that:

- Strips absolute file paths (Unix /home/user/... and Windows C:\Users\...) → <path>
- Redacts environment variable references ($HOME, %USERPROFILE%) → <env>
- Redacts values of sensitive keys (api_key, password, token, etc.) → <redacted>
- Recursively processes nested dicts and lists
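
The sanitization steps listed above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the regular expressions, the `SENSITIVE_MARKERS` tuple, and the function names `sanitize_string` and `sanitize_properties` are all assumptions made for the example, and the path patterns are deliberately simplified.

```python
import re
from typing import Any, Optional

# Assumed, simplified patterns; the PR's actual regexes are not shown here.
# Windows absolute path such as C:\Users\bob\model.gguf
WINDOWS_PATH = re.compile(r"(?<![A-Za-z])[A-Za-z]:\\(?:[\w.\-]+\\)*[\w.\-]*")
# Unix absolute path with at least one directory component. The lookbehind
# skips slashes inside URLs and dates such as 2026/04/07.
UNIX_PATH = re.compile(r"(?<![\w./:])/(?:[\w.\-]+/)+[\w.\-]*")
# Environment variable references: $HOME, ${HOME}, %USERPROFILE%
ENV_REF = re.compile(r"\$\{?[A-Za-z_]\w*\}?|%[A-Za-z_]\w*%")

# Hypothetical list of substrings that mark a key as sensitive.
SENSITIVE_MARKERS = ("api_key", "apikey", "password", "secret", "token")


def sanitize_string(text: str) -> str:
    """Replace file paths and environment variable references with placeholders."""
    text = WINDOWS_PATH.sub("<path>", text)
    text = UNIX_PATH.sub("<path>", text)
    return ENV_REF.sub("<env>", text)


def sanitize_properties(value: Any, key: Optional[str] = None) -> Any:
    """Recursively sanitize a telemetry payload made of dicts, lists, and scalars."""
    # A sensitive key hides its whole value, whatever its type.
    if key is not None and any(m in key.lower() for m in SENSITIVE_MARKERS):
        return "<redacted>"
    if isinstance(value, dict):
        return {k: sanitize_properties(v, str(k)) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_properties(v) for v in value]
    if isinstance(value, str):
        return sanitize_string(value)
    return value  # numbers, booleans, None pass through unchanged


example = {
    "error": 'File "/home/alice/proj/app.py", line 3',
    "api_key": "sk-123",
    "nested": {"cwd": "C:\\Users\\bob\\x", "vars": ["$HOME", "%USERPROFILE%", 5]},
}
print(sanitize_properties(example))
```

One trade-off of substring matching on key names is over-redaction: in this sketch a key such as `max_tokens` would also be replaced with `<redacted>`, because it contains `token`. An exact-match allowlist or denylist would avoid that at the cost of missing unusual key names.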

2. Model Download Integrity Verification (`local_setup.py` + new `download_security.py`)

Problem: Model files (multi-GB executables) are downloaded via wget.download() from HuggingFace URLs with no integrity verification. A network-level attacker (MITM) or a compromised CDN could serve tampered model files, which are then marked executable (chmod +x) and run.

Fix:

- New download_security.py module with verify_model_integrity() function
- Computes SHA-256 hash of downloaded files
- Verifies against expected hash when available
- Warns users when no hash is available (prints computed hash for manual verification)
- Automatically removes files that fail integrity checks
- Added sha256 field to all model entries in the model list (currently None; maintainers should populate with verified hashes)
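
The verification behavior described above might look something like the sketch below. It uses only stdlib `hashlib` and `os`; the signature of `verify_model_integrity()`, its boolean return value, and the helper `compute_sha256()` are assumptions for illustration, since the PR description does not show the actual implementation.

```python
import hashlib
import os
from typing import Optional


def compute_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-GB models never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_integrity(path: str, expected_sha256: Optional[str]) -> bool:
    """Return True if the file may be used, False if it was rejected and removed.

    Assumed signature; the PR's real function may differ.
    """
    actual = compute_sha256(path)
    if expected_sha256 is None:
        # No reference hash: warn and continue (non-blocking), printing the
        # computed hash so the user can compare it manually.
        print(f"Warning: no expected hash for {path}. Computed SHA-256: {actual}")
        return True
    if actual == expected_sha256.strip().lower():
        return True
    # Mismatch: delete the file so it can never be marked executable and run.
    print(f"Integrity check failed for {path}; removing file.")
    os.remove(path)
    return False
```

A caller would then treat a `False` result as a failed download, for example by returning `None` instead of the file path.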

Files Changed

| File | Change |
| --- | --- |
| `interpreter/core/utils/telemetry.py` | Added sanitization layer before `requests.post()` |
| `interpreter/terminal_interface/download_security.py` | New file: SHA-256 verification utility |
| `interpreter/terminal_interface/local_setup.py` | Import + call `verify_model_integrity()` after download |

Testing

- Sanitization is regex-based with no external dependencies
- Hash verification uses only stdlib hashlib
- No breaking changes: all existing behavior preserved
- When sha256 is None, download proceeds with a warning (non-blocking)

Next Steps (for maintainers)

Populate the sha256 field for each model entry with verified checksums from HuggingFace. Example:

{
    "name": "Llama-3.1-8B-Instruct",
    "sha256": "abc123...",  # from HuggingFace model card
    ...
}

… credentials

Added sanitization layer that:
- Strips absolute file paths (Unix/Windows) from all telemetry properties
- Redacts environment variable references ($HOME, %USERPROFILE%, etc.)
- Redacts values of sensitive keys (api_key, password, token, etc.)
- Recursively sanitizes nested dicts and lists

This prevents accidental leakage of local file paths, credentials, or other PII that may appear in exception stack traces sent via telemetry.

New utility module that:
- Computes SHA-256 checksums of downloaded model files
- Verifies against expected hashes when available
- Warns users when no hash is available for verification
- Automatically removes files that fail integrity checks

This addresses the risk of tampered or corrupted model files being downloaded and executed without any integrity verification.

- Import and call verify_model_integrity() after each wget.download()
- Add 'sha256' field to all model entries (None for now; maintainers should populate with verified hashes from HuggingFace)
- If hash verification fails, the corrupted file is automatically removed and download_model() returns None
- When no hash is provided, a warning is printed with the computed hash so users can verify manually

This prevents execution of tampered or corrupted model files that are downloaded from external sources without any integrity check.

MAXDVVV added 3 commits April 7, 2026 22:56

MAXDVVV wants to merge 3 commits into openinterpreter:main from MAXDVVV:fix/telemetry-sanitize-and-model-hash-verification
