A three-node Proxmox cluster running media automation, gaming (with GPU passthrough + game streaming), AI/ML workloads, and self-hosted productivity tools — all on consumer hardware.
This repo documents the architecture, services, and lessons learned. No credentials or personal info — just the blueprint.
┌─────────────────────────────────────────────────────────────────────────────┐
│ Proxmox VE Cluster ("HomeServer") │
│ 3 nodes · PVE 9.1.1 │
├──────────────────┬──────────────────────┬───────────────────────────────────┤
│ Node: pve │ Node: MediaServer │ Node: AIServer │
│ (Gaming/Dev) │ (Media Stack Host) │ (AI/ML Workloads) │
│ │ │ │
│ CPU: i7-9700K │ CPU: Ryzen 7 8845HS │ CPU: Ryzen AI MAX+ 395 │
│ RAM: 32 GB │ RAM: 28 GB │ RAM: 128 GB │
│ GPU: RTX 2070* │ iGPU: Radeon 780M │ iGPU: Radeon 8060S │
│ │ │ │
│ ┌────────────┐ │ ┌────────────────┐ │ ┌───────────────────────────┐ │
│ │ VM 103 │ │ │ LXC 200 │ │ │ LXC 101 Dev Workspace │ │
│ │ Bazzite │ │ │ Docker Host │ │ │ LXC 102 Ollama + WebUI │ │
│ │ Gaming VM │ │ │ 55+ containers │ │ │ LXC 104 Work Env │ │
│ │ 7c/24GB │ │ │ 12c/24GB │ │ │ LXC 105 ML Research │ │
│ │ GPU passthru │ │ │ + nginx SSO │ │ │ │ │
│ │ │ │ │ + SearXNG │ │ │ │ │
│ └────────────┘ │ └────────────────┘ │ ├───────────────────────────┤ │
│ │ │ │ Homelab API :9105 │ │
│ * Only GPU in │ DAS: 8TB btrfs │ │ └─ AI Agent (Jarvis) │ │
│ system — │ (USB TerraMaster) │ │ └─ Download Guardian │ │
│ host goes │ │ │ └─ Library Verification │ │
│ headless │ │ │ └─ Diagnostic Tools │ │
│ when VM runs │ │ │ Doc RAG :9103 │ │
│ │ │ │ Terraform :9104 │ │
│ │ │ └───────────────────────────┘ │
├──────────────────┴──────────────────────┴───────────────────────────────────┤
│ │
│ ┌── AI Agent Brain ──────────────────────────────────────────────────┐ │
│ │ qwen3.5:35b-a3b on Ollama (native tool calling, 66+ tools) │ │
│ │ │ │
│ │ Interfaces: │ │
│ │ Discord bot (*ai) ──┐ │ │
│ │ Homepage chat ──────┼── /api/ai/jarvis ── tool loop ── execute │ │
│ │ Open WebUI (MCP) ──┘ │ │
│ │ │ │
│ │ Subsystems: │ │
│ │ Librarr (Go, 13 sources, Torznab/Newznab, OPDS, embedded UI) │ │
│ │ Sentinel (Go, download guardian, library verification) │ │
│ │ Diagnostics (file ops, log reading, library rescans) │ │
│ │ SearXNG (self-hosted web search) ─── Open WebUI + Homepage │ │
│ │ Homelab Agent (proactive: 7 modules, 3-tier AI repair, │ │
│ │ every 5min, port 9106) │ │
│ │ Nightly Tests (88 tests at 5 AM, Discord results) │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
| Node | CPU | Cores/Threads | RAM | GPU | Role |
|---|---|---|---|---|---|
| pve | Intel i7-9700K | 8c/8t | 32 GB | NVIDIA RTX 2070 (passthrough) | Gaming / dev |
| MediaServer | AMD Ryzen 7 8845HS | 8c/16t | 28 GB | AMD Radeon 780M (iGPU) | Media stack |
| AIServer | AMD Ryzen AI MAX+ 395 | 16c/32t | 128 GB | AMD Radeon 8060S (iGPU) | AI/ML workloads |
- Boot drives: Local LVM-thin on each node (~100 GB each)
- DAS: TerraMaster TDAS enclosure, USB-attached to MediaServer, 8 TB btrfs
- Mounted at `/mnt/storage` on the MediaServer host
- Bind-mounted into LXC 200 at `/data/media`
- All media services depend on this mount — they won't start if the DAS is disconnected
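In Proxmox, that bind mount can be declared in the container's config file, along these lines (a sketch; the `mp0` syntax is standard Proxmox, but the exact options used here are not taken from this repo):

```
# /etc/pve/lxc/200.conf (excerpt)
# Bind-mount the host's DAS mount point into the container at /data/media
mp0: /mnt/storage,mp=/data/media
```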
Internet
│
├── Cloudflare Tunnel (cloudflared container)
│ └── Reverse proxy to select services
│
├── Tailscale mesh (node-to-node, stable IPs)
│
└── LAN (flat /24 network)
│
├── pve node
│ └── VM 103 (Bazzite) — bridged LAN + Tailscale
│
├── MediaServer node
│ └── LXC 200 — bridged LAN
│ ├── nginx reverse proxy (*.homelab.internal)
│ │ └── Authelia SSO (3-tier auth)
│ ├── gluetun VPN (Mullvad WireGuard)
│ │ ├── qBittorrent
│ │ ├── Librarr
│ │ └── Gamarr
│ ├── dnsmasq (local DNS for *.homelab.internal)
│ └── SearXNG (self-hosted web search)
│
└── AIServer node
├── LXC 101-106 — bridged LAN
├── Homelab API + AI Agent (port 9105)
└── MCP server (Proxmox management)
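The nginx + Authelia hookup shown in the tree above is typically wired with nginx's `auth_request` directive; a minimal sketch (excerpt only, cert directives omitted; server names and the Authelia upstream address are assumptions, not this repo's actual config):

```nginx
server {
    listen 443 ssl;
    server_name jellyfin.homelab.internal;

    location / {
        auth_request /authelia;              # every request is verified first
        proxy_pass http://jellyfin:8096;
    }

    location = /authelia {
        internal;
        proxy_pass http://authelia:9091/api/verify;
        proxy_set_header X-Original-URL $scheme://$host$request_uri;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
    }
}
```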
Download clients (qBittorrent, Librarr, Gamarr) route through a gluetun container running Mullvad WireGuard. Services that need VPN protection use `network_mode: "service:gluetun"` in Docker Compose and expose their ports through gluetun.
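A sanitized compose sketch of that pattern (image names are the real upstream ones; the port and environment values are illustrative, not this repo's actual config):

```yaml
services:
  gluetun:
    image: qmcgaw/gluetun
    cap_add: [NET_ADMIN]
    environment:
      VPN_SERVICE_PROVIDER: mullvad
      VPN_TYPE: wireguard
    ports:
      - "8080:8080"                      # qBittorrent WebUI, exposed via gluetun

  qbittorrent:
    image: lscr.io/linuxserver/qbittorrent
    network_mode: "service:gluetun"      # all traffic exits through the VPN
    depends_on: [gluetun]
```

Because the client shares gluetun's network namespace, its ports must be published on the gluetun service; if the VPN drops, the client loses connectivity instead of leaking.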
The homelab is controlled by a tool-calling AI agent powered by local LLMs (qwen3.5:35b-a3b / gemma4:e4b) running on Ollama with GPU-accelerated inference via the AMD 8060S iGPU's GTT unified memory. The agent has 66+ tools for managing every aspect of the homelab, and hits a 10/10 mean score on the internal eval harness across a canned set of real-world prompts.
On top of the basic tool-calling loop, the stack adds:
- Semantic tool routing — embedding-similarity hybrid replaces keyword matching (catches "prove it" → `verify_in_library`)
- Episodic memory — summaries of past conversations are embedded and retrieved on new messages, so context persists across sessions and interfaces
- LLM observability — every Ollama call traced to SQLite with latency / token / tool-success metadata, visible in the PWA
- Code execution sandbox — `execute_code` tool runs Python in a hardened bubblewrap namespace (no network, 5s CPU, 512MB RAM, fs-isolated)
- Unified homelab RAG — ChromaDB ingest of Sonarr/Radarr/Jellyfin/git/agent-failures for "ask anything about my homelab"
- Tier 2 verify step — after the smart fixer declares a fix, syntax/container-health/LLM-judge checks run; any failure reverts file edits from backups
- Eval harness — canned prompts + LLM judge nightly, with a regression gate in the nightly test suite
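The semantic tool routing described above can be sketched as a hybrid scorer: embedding similarity decides, with keyword hits acting as a score floor. Everything here (the vectors, keyword sets, and floor value) is invented for illustration; the real stack presumably gets embeddings from a local model via Ollama:

```python
import math

# Toy stand-in embeddings; tool names come from the doc, vectors are invented.
TOOL_VECS = {
    "verify_in_library": [0.9, 0.1, 0.2],
    "restart_container": [0.1, 0.9, 0.3],
}
TOOL_KEYWORDS = {
    "verify_in_library": {"verify", "library"},
    "restart_container": {"restart", "container"},
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def route(query_vec, query_words, floor=0.2):
    """Hybrid routing: embedding similarity, with keyword hits as a baseline floor."""
    best, best_score = None, -1.0
    for tool, vec in TOOL_VECS.items():
        score = cosine(query_vec, vec)
        if TOOL_KEYWORDS[tool] & set(query_words):  # keyword hit guarantees a floor
            score = max(score, floor)
        if score > best_score:
            best, best_score = tool, score
    return best
```

With this shape, a paraphrase like "prove it" has no keyword overlap with any tool, but its embedding still lands near `verify_in_library`, which is exactly the miss the old keyword router had.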
A proactive Homelab Agent with 7 modules scans every 5 minutes and uses a 3-tier AI repair system (qwen3:1.7b fast tools → qwen3.5:35b smart fixer with verify → Claude Code backstop) to autonomously detect and fix issues.
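A minimal sketch of that escalation order (the model names are from this doc; the selection logic itself is an assumption for illustration):

```python
# Three repair tiers, cheapest first; each failed attempt escalates one tier.
TIERS = [
    ("qwen3:1.7b", "fast tools"),    # tier 1: quick, cheap remediations
    ("qwen3.5:35b", "smart fixer"),  # tier 2: bigger model, fix is verified after
    ("claude-code", "backstop"),     # tier 3: last resort
]

def next_tier(failed_attempts: int):
    """Return the (model, role) to try after `failed_attempts` failures, or None."""
    if failed_attempts >= len(TIERS):
        return None  # all tiers exhausted; alert a human instead
    return TIERS[failed_attempts]
```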
User (Discord / Homepage / Open WebUI)
└── /api/ai/jarvis
└── LLM decides which tools to call
└── Executes against homelab APIs
└── Feeds results back to LLM
└── Generates natural language response
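The flow above can be written as a short loop. This is a runnable sketch with a stubbed LLM: the real stack calls Ollama's chat API with native tool calling, and the `container_status` tool here is hypothetical, not one of the repo's actual tools:

```python
import json

# One illustrative tool; the real agent registers 66+ of them.
TOOLS = {
    "container_status": lambda name: {"name": name, "state": "running"},
}

def call_llm(messages):
    # Stub: ask for one tool call, then answer from its result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "container_status", "args": {"name": "jellyfin"}}]}
    result = json.loads([m for m in messages if m["role"] == "tool"][-1]["content"])
    return {"content": f"jellyfin is {result['state']}"}

def agent(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = call_llm(messages)
        if "tool_calls" not in reply:
            return reply["content"]            # final natural-language answer
        for call in reply["tool_calls"]:       # execute each requested tool
            result = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": json.dumps(result)})
```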
All three interfaces share the same agent brain:
| Interface | How | Use Case |
|---|---|---|
| Discord bot | `*ai <anything>` command | Mobile / quick commands |
| Homepage widget | Floating chat bubble (custom.js) | Dashboard integration |
| Open WebUI | MCP tools proxy | Full chat UI with history |
| System | Purpose |
|---|---|
| Librarr | Go binary (18 MB), 13 search sources, Torznab/Newznab API, OPDS feed, Usenet/SABnzbd, modern Tailwind dark UI, series grouping, wishlist |
| Sentinel | Go binary (11 MB), download guardian with SQLite persistence, definitive library verification |
| Homelab Agent | Proactive monitoring (5min), 7 modules (container doctor, source intelligence, import watchdog, torrent doctor, system monitor, notifications, AI escalation), 3-tier repair system, failure memory |
| Diagnostic Tools | File ops, log reading, permission fixes, library rescans — for AI escalation |
| SearXNG | Self-hosted web search for AI agent, Homepage, Open WebUI |
| Paperless Tagging | AI-driven document tagging and correspondent assignment |
| Gaming API | Game search, ROM download, sync status, Bazzite VM control |
| Nightly Tests | 88 end-to-end tests at 5 AM (~60s), Discord results notification |
See AI Stack for full details.
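As an illustration of Sentinel's verification idea from the table above, a toy sketch: an item only counts as imported when at least one library backend actually confirms it. The backends and titles here are invented; the real checks query Jellyfin, ABS, Kavita, Sonarr, and Radarr:

```python
# Each checker answers "can this backend actually see the item?"
CHECKERS = {
    "jellyfin": lambda title: title in {"Dune", "Blade Runner"},
    "radarr":   lambda title: title in {"Dune"},
}

def verify_in_library(title: str):
    """Return the list of backends that confirm the item is present."""
    return [name for name, check in CHECKERS.items() if check(title)]
```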
| VMID | Name | Node | Type | Resources | Purpose |
|---|---|---|---|---|---|
| 101 | project-env | AIServer | LXC | 4c / 4 GB | Development workspace |
| 102 | openclaw | AIServer | LXC | 16c / 28 GB | Local LLM chat (Ollama + Open-WebUI) |
| 103 | gaming-bazzite | pve | VM | 7c / 24 GB | Gaming VM with GPU passthrough |
| 104 | work-env | AIServer | LXC | 4c / 4 GB | Claude Code, Docker, dev tools |
| 105 | research-env | AIServer | LXC | 16c / 16 GB | AI/ML research with GPU passthrough |
| 200 | docker-server | MediaServer | LXC | 12c / 24 GB | Main Docker host (55+ containers) |
| Doc | Description |
|---|---|
| Docker Services | All 55+ containers running on LXC 200 |
| Gaming VM | Bazzite setup, GPU passthrough, Sunshine/Moonlight streaming |
| Game Pipeline | Automated game download → install → Steam library pipeline |
| AI Stack | Tool-calling agent, Download Guardian, verification, diagnostics, RAG, SearXNG, Homelab Agent, nightly tests |
| Automation | Download Guardian, Homelab Agent, backups, nightly tests, CrowdSec, Terraform, dual-channel alerts |
| Monitoring | Homelab Agent (7 modules, 3-tier AI repair), n8n watchdog workflows, Homepage dashboard, storage monitoring |
| Media Stack | Jellyfin, *arr apps, download automation |
| Networking | VPN, Cloudflare tunnel, Tailscale mesh, nginx + Authelia SSO |
| Lessons Learned | Gotchas, debugging tips, things that broke |
| Docker Compose (example) | Sanitized compose file |
- 7 guests across 3 nodes (6 LXC + 1 VM)
- 55+ Docker containers on a single LXC
- ~188 GB total RAM across the cluster
- 8 TB DAS for media storage
- GPU passthrough on 2 nodes (NVIDIA for gaming, AMD iGPU shared across 3 LXCs for ML)
- AI tool-calling agent — 66+ tools, local LLMs (qwen3.5:35b-a3b + gemma4:e4b), GPU-accelerated via GTT unified memory, 10/10 stable on internal eval harness
- Semantic tool routing — embedding-similarity tool retrieval (catches paraphrases the old keyword router missed); hybrid with keyword hits as a baseline floor
- Episodic memory — past chats are summarized + embedded + retrieved cross-interface, so the assistant remembers context between Discord, PWA, and Open WebUI sessions
- LLM observability — SQLite trace of every Ollama call (latency/tokens/tool success/errors); real-time stats rendered on the mobile PWA
- Code execution sandbox — Python/bash tool runs in a bubblewrap-isolated namespace (no network, fs-isolated, resource-capped, timeout-enforced)
- Unified homelab RAG — ChromaDB ingest of Sonarr/Radarr/Jellyfin/git/agent-failures; natural-language queries against every source with a `?source=` filter
- Tier 2 verify step — smart fixer's fixes are independently validated (syntax / container health / LLM judge); file edits auto-revert from backup on failure
- Eval harness — 10 canned prompts replayed nightly with LLM judge scoring, SQLite history, regression gate in nightly tests
- 4 agent interfaces — Discord bot, Homepage chat widget, mobile PWA, Open WebUI (same brain, same tools)
- Librarr (Go) — 18 MB binary, 13 search sources, Torznab/Newznab API, OPDS feed, Usenet/SABnzbd, multi-user with TOTP 2FA + OIDC/SSO, modern dark Tailwind UI with series grouping and wishlist
- Sentinel (Go) — 11 MB binary, download guardian with SQLite persistence, definitive library verification (Jellyfin/ABS/Kavita/Sonarr/Radarr)
- Homelab Agent — proactive monitoring every 5min, 7 modules (container doctor, source intelligence, import watchdog, torrent doctor, system monitor, notifications, AI escalation), 3-tier AI repair system, failure memory (SQLite)
- Service integrations — Mealie recipe import, Changedetection URL watches, Linkwarden bookmarks, AI auto-tagging for Paperless, Docker container control (restart/stop/start)
- 100+ nightly tests — comprehensive end-to-end tests at 5 AM, covers all services + smart fixer + escalation + AI stack (traces/memory/sandbox/RAG/evals/semantic routing), plus an eval-score regression gate; 128 unit tests across homelab-api/doc-rag/homelab-agent, Discord results notification
- SearXNG — self-hosted web search for AI agent, Homepage dashboard, Open WebUI
- Diagnostic toolkit — file ops, log reading, permission fixes, library rescans for AI escalation
- Unified API — single FastAPI endpoint aggregating all services (Swagger docs included)
- Document RAG — vector search over 169+ documents via local embeddings + LLM
- Automated backups — Restic to DAS, 4 nodes, daily, encrypted, deduplicated
- SSO reverse proxy — nginx + Authelia, 34 subdomains on `*.homelab.internal`, 3-tier auth (true SSO / gate / passthrough), self-signed wildcard cert, dnsmasq for LAN + Tailscale split DNS for remote
- CrowdSec IPS — 1400+ malicious IPs blocked at firewall, community threat intel
- Terraform IaC — entire cluster defined as code, importable state
- 9 n8n workflows — dual-channel Discord alerts, watchdogs, health checks
- AI self-healing — consolidated Homelab Agent with 3-tier repair (1.7b fast tools → 35b smart fixer → Claude Code backstop) auto-fixes containers, torrents, VPN, permissions, imports, configs
- Dual-channel Discord alerts — all watchdogs and bots report to both Discord servers
- Zero cloud dependencies — everything self-hosted (except Cloudflare tunnel for external access)
Custom Go services built for this homelab, available as standalone projects:
| Project | Language | Description |
|---|---|---|
| Librarr | Go | Book/audiobook/manga search + download, 13 sources, Torznab API, OPDS feed |
| Sentinel | Go | Download guardian with library verification (Jellyfin/ABS/Kavita/Sonarr/Radarr) |
| Gamarr | Go | Game/ROM search + download, 24 platforms, 3 sources, 43 e2e tests |
| Homelab Blueprint | Docs | This repo — architecture documentation |
MIT — use this as inspiration for your own homelab.