Intercept and inspect Coding Agent API traffic from Claude Code, Codex CLI, Gemini CLI, Cursor CLI, OpenCode, Kimi, Pi, and Hermes in a local trace viewer.
Updated May 30, 2026 - Python
The open-source MultiAgentOps evaluation and verification harness for any industry business workflow.
🔍 AI observability skill for Claude Code. Debug LangChain/LangGraph agents by fetching execution traces from LangSmith Studio directly in your terminal.
Local open-source dev tool to debug, secure, and evaluate LLM agents. Provides static analysis, dynamic security checks, and runtime monitoring; integrates with Cursor and Claude Code.
Diagnose your AI agents in production. Extract policies from prompts, evaluate traces, generate diagnostic reports.
Cut your OpenClaw / ZeroClaw token bill. Find which model earns its cost. Prove whether optimizations actually work. Local, no upload.
Local replay debugger for Browser Use failures with screenshots, model I/O, failed-step timelines, and public-safe HTML exports.
Visual debugging, tracing, and replay for agent workflows.
🔍 A beautiful web viewer for AI agent session files. Browse Claude Code & OpenClaw conversations with chat-style UI, timeline visualization, and zero setup.
Compatibility and diagnostics for DeepSeek V4 tool-calling agents
ChainWatch is a flight data recorder for multi-step AI systems. It's a CLI-based tool that records every step in an AI decision chain, links them together in order, prevents tampering, and allows you to verify the chain's integrity and replay the full decision flow.
Android Agent Reliability Runtime: a debugging and safety runtime for mobile GUI agents. Detect readiness, block unsafe actions, verify progress, diagnose failures, and save reproducible traces.
Kaleidoskop — replay your baro/Mozaik agent runs visually. Audit log → hexagonal neural firing in your browser.
Explain why your agent failed — root-cause debugging, memory attribution, and run divergence for LLM agents.
AI agents fail like junior teammates—looping on bad ideas, ignoring feedback, escalating commitment. vstack ports 34 of the most-cited organizational-behavior frameworks so you can diagnose your agents the same way you'd diagnose your team.
A real-time observability and debugging layer for AI agents.
Self-hosted debugging for AI agent runs
TDD for AI agents — watch world state morph step-by-step. Drop-in for Vercel AI SDK / Anthropic SDK / LangChain. Scrubbable trajectories + bulk grid view.
RunLens helps teams compare and debug AI agent runs with step timelines, run diffs, and cost analysis.