The only proxy you need for Claude Code CLI
Quick Start • Features • Web Dashboard • Crosstalk • Compression Stack • Roadmap • Changelog
Route Claude Code to any provider. Save 90% on API costs. Run locally for free.
The Ultimate Proxy sits between Claude Code CLI and your chosen API provider. It translates Anthropic's API format to OpenAI-compatible format, letting you use any model with Claude Code:
```
Claude Code CLI → The Ultimate Proxy → Any Provider
                                        ├── OpenRouter
                                        ├── Gemini / VibeProxy
                                        ├── OpenAI
                                        ├── Azure
                                        ├── Ollama (local)
                                        └── LM Studio (local)
```
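The translation step can be pictured as a simple payload mapping. The sketch below is illustrative only, assuming minimal request shapes: `translate_request` is a hypothetical helper, not the proxy's actual code, though the field names mirror the public Anthropic Messages and OpenAI chat-completions APIs.

```python
def translate_request(anthropic_req: dict) -> dict:
    """Map an Anthropic Messages API payload onto an OpenAI
    chat-completions payload (illustrative subset only)."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message.
    if "system" in anthropic_req:
        messages.append({"role": "system", "content": anthropic_req["system"]})
    messages.extend(anthropic_req.get("messages", []))
    return {
        "model": anthropic_req["model"],  # remapped by the model router in practice
        "messages": messages,
        "max_tokens": anthropic_req.get("max_tokens", 1024),
    }
```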
Single command installs everything with automatic GPU detection:
```bash
curl -fsSL https://raw.githubusercontent.com/aaaronmiller/claude-code-proxy/main/install-all.sh | bash
```

The installer will:

- Auto-detect your GPU (NVIDIA CUDA, Intel Arc/iGPU, AMD ROCm, or CPU-only)
- Prompt you to confirm or override the detected GPU backend
- Clone claude-code-proxy
- Install Headroom (compression layer)
- Install RTK (command compression)
- Install the GPU compute stack (Level Zero for Intel, CUDA for NVIDIA, ROCm for AMD)
- Configure environment variables (`ONEAPI_DEVICE_SELECTOR`, `CUDA_VISIBLE_DEVICES`, etc.)
- Add compression aliases (`cc`, `qw`, `qw-resume`)
- Start all services
- Show health status
Supported GPU backends:
| Backend | Hardware | Env Vars Set |
|---|---|---|
| NVIDIA CUDA | RTX 30/40/50 series, datacenter GPUs | `CUDA_VISIBLE_DEVICES=0` |
| Intel Arc / iGPU | Arc A370M/A580/A770, Iris Xe, UHD | `ONEAPI_DEVICE_SELECTOR=level_zero:0`, `LIBVA_DRIVER_NAME=iHD` |
| AMD ROCm | RX 6000/7000, Instinct MI series | `HSA_OVERRIDE_GFX_VERSION=10.3.0` |
| CPU-only | No GPU available | `HEADROOM_DEVICE=cpu` |
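The detection idea behind the table can be sketched as a probe for vendor tooling. This is an assumption about how such a detector might work, not the installer's actual shell logic; only the env values mirror the table above.

```python
import shutil

def detect_gpu_backend() -> tuple[str, dict]:
    """Pick a backend by probing for vendor tools on PATH, falling
    back to CPU-only. Illustrative only; the real install-all.sh
    is a shell script with richer hardware checks."""
    if shutil.which("nvidia-smi"):
        return "cuda", {"CUDA_VISIBLE_DEVICES": "0"}
    if shutil.which("sycl-ls") or shutil.which("clinfo"):
        return "level_zero", {"ONEAPI_DEVICE_SELECTOR": "level_zero:0",
                              "LIBVA_DRIVER_NAME": "iHD"}
    if shutil.which("rocminfo"):
        return "rocm", {"HSA_OVERRIDE_GFX_VERSION": "10.3.0"}
    return "cpu", {"HEADROOM_DEVICE": "cpu"}
```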
Manual installation:
```bash
# Clone & install
git clone https://github.com/aaaronmiller/claude-code-proxy.git ~/code/claude-code-proxy
cd ~/code/claude-code-proxy
./install-all.sh
```

In a new terminal, run:

```bash
export ANTHROPIC_BASE_URL=http://localhost:8082
export ANTHROPIC_API_KEY=pass
claude
```

If you want the proxy itself to require a client key, set `PROXY_AUTH_KEY`. `ANTHROPIC_API_KEY` is for the Claude Code client side and no longer enables proxy auth by default.

💡 Pro Tip: Add aliases to your `~/.zshrc` for easier access (see Setup Guide)
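For example, a `~/.zshrc` fragment along these lines works; the alias names are suggestions and the paths assume the default clone location from the manual install above.

```shell
# ~/.zshrc — example aliases (names are suggestions, adjust to taste)
alias cc-proxy='ANTHROPIC_BASE_URL=http://localhost:8082 ANTHROPIC_API_KEY=pass claude'
alias proxy-up='python ~/code/claude-code-proxy/start_proxy.py'
alias proxy-doctor='python ~/code/claude-code-proxy/start_proxy.py --doctor'
```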
The easiest way to use Claude Code with premium models for free:
- Install VibeProxy: Download
- Authenticate: Sign in with Google (Antigravity OAuth)
- Setup: `python start_proxy.py --setup` → Select "VibeProxy/Antigravity"
What you get:
- Claude Opus 4.5 with 128k thinking tokens
- Gemini 3 Pro/Flash
- BIG/MIDDLE/SMALL model routing
- Usage tracking & analytics
- No API keys or billing required
| Feature | Description |
|---|---|
| Multi-Provider | OpenRouter, Gemini, OpenAI, Azure, Ollama, LM Studio |
| Model Routing | BIG/MIDDLE/SMALL tiers with intelligent routing |
| Web Dashboard | Real-time monitoring with 2025 glassmorphism UI |
| Crosstalk | Model-to-model conversations (up to 8 models) |
| Usage Tracking | Cost analytics and token metrics |
| Extended Thinking | Up to 128k thinking tokens for reasoning models |
| Prompt Injection | Add custom prompts showing routing info |
| Model Cascade | Automatic fallback on provider errors |
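The BIG/MIDDLE/SMALL routing can be sketched as a name-based tier map. The env var names and default model IDs below are assumptions for illustration (the real proxy manages tiers via `.env` and `--set-big` / `--select-models`), but the opus/sonnet/haiku heuristic shows the shape of the idea.

```python
import os

# Hypothetical tier table; variable names and defaults are illustrative.
TIERS = {
    "BIG": os.environ.get("BIG_MODEL", "anthropic/claude-opus-4.5"),
    "MIDDLE": os.environ.get("MIDDLE_MODEL", "google/gemini-3-pro"),
    "SMALL": os.environ.get("SMALL_MODEL", "google/gemini-3-flash"),
}

def route(requested_model: str) -> str:
    """Map the model Claude Code asks for onto a configured tier:
    opus-class names go BIG, haiku-class SMALL, everything else MIDDLE."""
    name = requested_model.lower()
    if "opus" in name:
        return TIERS["BIG"]
    if "haiku" in name:
        return TIERS["SMALL"]
    return TIERS["MIDDLE"]
```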
Access at http://localhost:8082 when running.
- 2025 Glassmorphism UI with aurora gradients
- Real-time request monitoring
- Provider configuration
- Model selection with hybrid routing
- WebSocket live log streaming
- Profile management
Multi-model conversations where AI models talk to each other:
```bash
# Launch visual TUI
python start_proxy.py --crosstalk-studio

# Quick setup
python start_proxy.py --crosstalk "claude-opus,gemini-pro" --topic "Explore consciousness"
```

Features:
- Up to 8 models in a conversation
- Multiple paradigms: relay, debate, memory, report
- Jinja templates for message formatting
- Backrooms-compatible import/export
- MCP integration for programmatic access
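The relay paradigm can be sketched as a round-robin loop where each model's reply becomes the next model's input. The stub "models" here are plain functions standing in for provider calls; this is a conceptual sketch, not Crosstalk's implementation.

```python
from typing import Callable

def relay(models: dict[str, Callable[[str], str]],
          topic: str, rounds: int = 2) -> list[tuple[str, str]]:
    """Round-robin relay: each model sees the latest message and
    replies; the reply is handed to the next model in turn."""
    transcript: list[tuple[str, str]] = []
    message = topic
    for _ in range(rounds):
        for name, model in models.items():
            message = model(message)
            transcript.append((name, message))
    return transcript

# Usage with trivial echo stubs in place of real models:
stubs = {"a": lambda m: m + "-a", "b": lambda m: m + "-b"}
print(relay(stubs, "x", rounds=1))
```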
```bash
# Configuration
python start_proxy.py --setup          # First-time wizard
python start_proxy.py --settings       # Unified settings TUI
python start_proxy.py --doctor         # Health check + auto-fix

# Model management
python start_proxy.py --select-models  # Interactive model selector
python start_proxy.py --set-big MODEL  # Quick set BIG model
python start_proxy.py --show-models    # List available models

# Diagnostics
python start_proxy.py --config         # Show configuration
python start_proxy.py --dry-run        # Validate without starting
python start_proxy.py --analytics      # View usage stats
```

```
the-ultimate-proxy/
├── start_proxy.py        # Main entry point
├── .env                  # Configuration
├── CROSSTALK.md          # Multi-model chat docs
│
├── src/
│   ├── core/             # Proxy core logic
│   ├── api/              # FastAPI routes + WebSocket
│   ├── services/         # Providers, models, prompts
│   └── cli/              # CLI tools and TUIs
│
├── configs/crosstalk/    # Crosstalk presets & templates
├── web-ui/               # Svelte + bits-ui dashboard
└── docs/                 # Extended documentation
```
401 Unauthorized

```bash
python start_proxy.py --doctor  # Auto-fix API keys
```

Model Not Found

```bash
python start_proxy.py --show-models  # List available models
```

Connection Refused

- Ensure the proxy is running: `python start_proxy.py`
- Check that port 8082 is not in use
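A quick way to check whether anything is listening on the proxy port, using only the standard library:

```python
import socket

def port_open(host: str = "127.0.0.1", port: int = 8082) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False
```

If this returns `False` the proxy isn't up; if `True` but Claude Code still fails, another process may hold the port.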
NEW (April 2026): Integrated compression layer for 95-99% cost savings
```bash
# Start compression stack
cs-start

# Use Claude with auto-compression
csi  # or csr to resume

# Quick stats
cs-stats-quick
```

- 97% compression rate (900 → 26 tokens)
- GPU acceleration: NVIDIA CUDA, Intel Arc (Level Zero), AMD ROCm, or CPU-only
- Multi-CLI support (Claude, Qwen, Codex, OpenCode, OpenClaw, Hermes)
- Real-time dashboard (terminal + web at `:8899`)
- Multi-tier compression: default (`:8787`) + small model (`:8790`)
- Semantic caching: deduplicates repeated context across turns
- RTK command compression: 2-line summaries for CLI output
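The deduplication idea behind semantic caching can be sketched with plain content hashing. This exact-match version is a deliberate simplification for illustration; the real Headroom layer is assumed to do fuzzier, semantic matching.

```python
import hashlib

class ContextCache:
    """Replace context blocks already sent this session with a short
    reference token, so repeated context isn't re-sent each turn.
    (Exact-hash stand-in for semantic matching.)"""
    def __init__(self) -> None:
        self.seen: dict[str, str] = {}

    def compress(self, blocks: list[str]) -> list[str]:
        out = []
        for block in blocks:
            key = hashlib.sha256(block.encode()).hexdigest()[:12]
            if key in self.seen:
                out.append(f"[cached:{key}]")  # reference, not the full text
            else:
                self.seen[key] = block
                out.append(block)
        return out
```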
```
Claude Code → Proxy (:8082) → Headroom (:8787) → OpenRouter
                                   │
                              GPU: 92% VRAM
                              97% compression
```
- Full Guide: docs/COMPRESSION-STACK.md
- Performance Analysis: docs/PERFORMANCE-ANALYSIS.md
- GPU Optimization: docs/GPU-OPTIMIZATION.md
- Roadmap: ROADMAP.md
- Changelog: changelog.md
The installer handles this automatically:
- Intel Arc A370M (4GB) → Level Zero runtime, `ONEAPI_DEVICE_SELECTOR=level_zero:0`
- CPU-only → fallback when no GPU is detected, runs entirely on CPU
- WSL2 passthrough → GPU exposed via `/dev/dxg` from the Windows host
For manual configuration, see docs/GPU-OPTIMIZATION.md
See full roadmap: ROADMAP.md
- Single install script with GPU auto-detection (`install-all.sh` v2.0)
- Unified start/stop commands
- Low-VRAM mode (Intel Arc A370M 4GB via Level Zero)
- No-GPU mode (CPU-only fallback)
- Network proxy access (HTTP/HTTPS)
- Compression monitoring dashboard (`:8899`)
- RTK integration (v0.34.3)
- Shared state management
- Unified health monitoring
- Log aggregation
- Headroom as proxy middleware
- Single unified proxy process
- Resource optimization
The Ultimate Proxy • Made with ❤️ for the Claude Code community