Observability for AI apps in production. See every LLM call your app makes — model, tokens, latency, errors — in a local dashboard. Zero cloud. Zero config.
If you're building software that calls AI APIs (OpenAI, Anthropic, OpenRouter, Groq, Ollama, etc.), agent-watch shows you exactly what's happening under the hood. It's built for:
- Backend developers building chatbots, AI assistants, or automated workflows
- Startup teams running AI agents in production that need to debug failures fast
- Freelancers building AI-powered apps for clients who ask "why did it break?"
- Teams with AI pipelines processing documents, analyzing data, or generating content
Your AI app calls GPT-4 or Claude 50 times per request. One of those calls fails, returns garbage, or takes 10 seconds. Without agent-watch, you dig through logs for hours. With it, you open a dashboard and see the exact call that failed, why, and how long it took.
It's not for:
- End users chatting with ChatGPT (you don't need this)
- Non-technical users (this is a developer tool)
Run one command. No code changes needed:
```bash
npx agent-watch --target https://openrouter.ai/api/v1
```

Then point your app to http://localhost:4201 instead of the API URL:
```python
# Python
client = OpenAI(base_url="http://localhost:4201/v1")
```

```javascript
// JavaScript
const client = new OpenAI({ baseURL: "http://localhost:4201/v1" });
```

```bash
# curl
curl http://localhost:4201/v1/chat/completions -H "Authorization: Bearer YOUR_KEY" ...
```

Open http://localhost:4200 → see every call in real time.
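One convenient pattern is to read the base URL from an environment variable, so the proxy can be switched on and off without touching code. A minimal sketch — `AGENT_WATCH_PROXY` is a hypothetical variable used here for illustration, not something agent-watch defines:

```javascript
// Sketch: resolve the client's base URL from an env variable so the
// agent-watch proxy can be toggled without code changes.
// AGENT_WATCH_PROXY is a hypothetical variable for this example.
function resolveBaseURL(defaultURL = "https://api.openai.com/v1") {
  return process.env.AGENT_WATCH_PROXY || defaultURL;
}

process.env.AGENT_WATCH_PROXY = "http://localhost:4201/v1";
console.log(resolveBaseURL()); // http://localhost:4201/v1

delete process.env.AGENT_WATCH_PROXY;
console.log(resolveBaseURL()); // https://api.openai.com/v1
```

Pass the result as `base_url` (Python) or `baseURL` (JavaScript) when constructing the client.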
Works with any language (Python, Go, Rust, Java, curl) and any provider:
```bash
npx agent-watch --target https://api.openai.com        # OpenAI
npx agent-watch --target https://api.anthropic.com     # Anthropic
npx agent-watch --target https://openrouter.ai/api/v1  # OpenRouter
npx agent-watch --target http://localhost:11434        # Ollama (local)
npx agent-watch --target https://api.groq.com/openai   # Groq
npx agent-watch --target https://api.together.xyz      # Together
npx agent-watch --target https://api.mistral.ai        # Mistral
```

Security: The proxy only listens on `127.0.0.1` (your machine). API keys are forwarded to the provider but never stored locally.
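Conceptually, the per-request rewrite a forwarding proxy like this performs is small: swap the origin for the `--target` URL, keep the path and the `Authorization` header, and drop the local `Host` header. A sketch under those assumptions — `buildUpstreamRequest` is a hypothetical helper, not agent-watch's actual API:

```javascript
// Sketch: rewrite an incoming proxy request into the upstream request.
// Keeps the path and API key, drops the local Host header so it can be
// re-derived from the upstream URL. Hypothetical helper for illustration.
function buildUpstreamRequest(target, path, headers) {
  const url = target.replace(/\/$/, "") + path;
  const { host, ...forwarded } = headers; // host must match the upstream
  return { url, headers: forwarded };
}

const req = buildUpstreamRequest(
  "https://openrouter.ai/api/v1",
  "/chat/completions",
  { host: "localhost:4201", authorization: "Bearer YOUR_KEY" }
);
console.log(req.url); // https://openrouter.ai/api/v1/chat/completions
```

The key passes straight through to the provider in `forwarded`; nothing in this flow needs to persist it, which matches the security note above.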
If your app is Node.js, just add one line at the top:
```javascript
require('agent-watch/auto');
```

Every OpenAI/Anthropic call is traced automatically. Run the dashboard with:

```bash
npx agent-watch serve
```

```
✓ classify-ticket     0.8s   gpt-4  tokens: 150  ok
✓ lookup-customer     0.2s   —      DB query     ok
✗ generate-response   ERROR  gpt-4  rate_limit   error
✓ retry-response      1.2s   gpt-4  tokens: 340  ok
✓ send-email          0.1s   —      SMTP         ok
```
Click any trace to see the full span tree — every step your agent took, with timing, token counts, and the exact error message when something breaks.
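A span tree like this is just flat records linked by parent IDs. A minimal rendering sketch — the `{ id, parentId, name }` record shape is assumed for illustration and may differ from agent-watch's stored schema:

```javascript
// Sketch: render a span tree from flat records linked by parentId.
// Record shape is assumed for this example, not agent-watch's schema.
function renderTree(spans, parentId = null, depth = 0) {
  return spans
    .filter((s) => s.parentId === parentId)
    .flatMap((s) => [
      "  ".repeat(depth) + s.name,          // indent by nesting depth
      ...renderTree(spans, s.id, depth + 1), // then render children
    ]);
}

const lines = renderTree([
  { id: 1, parentId: null, name: "process-order" },
  { id: 2, parentId: 1, name: "validate-input" },
  { id: 3, parentId: 1, name: "llm-call" },
]);
console.log(lines.join("\n"));
// process-order
//   validate-input
//   llm-call
```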
Open the dashboard, find the trace, see which LLM call returned unexpected output. Fix the prompt, not the whole app.
Every trace shows input/output token counts. See which agent or workflow burns the most API budget.
Latency per span. Instantly see if it's the LLM (2s), the database (50ms), or the tool call (5s timeout).
The span tree makes it obvious — 47 identical calls to the same tool. No more guessing from flat logs.
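The same questions — where did the time go, and is the agent looping — can be answered mechanically from a flat span list. A sketch with an assumed `{ name, ms, tokens }` span shape (illustrative only):

```javascript
// Sketch: find the slowest span and repeated calls to the same tool
// from a flat span list. Span shape is assumed for illustration.
const spans = [
  { name: "classify-ticket", ms: 800, tokens: 150 },
  { name: "lookup-customer", ms: 50, tokens: 0 },
  { name: "web-search", ms: 300, tokens: 0 },
  { name: "web-search", ms: 310, tokens: 0 },
  { name: "web-search", ms: 290, tokens: 0 },
];

// Latency hot spot: the single slowest span.
const slowest = spans.reduce((a, b) => (b.ms > a.ms ? b : a));

// Loop detection: any name called more than twice.
const callCounts = {};
for (const s of spans) callCounts[s.name] = (callCounts[s.name] || 0) + 1;
const loops = Object.entries(callCounts).filter(([, n]) => n > 2);

console.log(slowest.name); // classify-ticket
console.log(loops);        // [ [ 'web-search', 3 ] ]
```

The same aggregation over `tokens` answers the cost question: sum per name to see which step burns the budget.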
Regulated industries (finance, healthcare) need records of AI decisions. agent-watch stores the full decision tree with timestamps.
Compare traces before and after. See how token usage, latency, and outputs differ.
```bash
agent-watch --target <url>        # Start proxy + dashboard
agent-watch serve --port 4200     # Dashboard only
agent-watch list                  # Recent traces
agent-watch list --status error   # Only failures
agent-watch replay <trace-id>     # Full span tree
agent-watch stats                 # Error rate, avg latency, top failures
```

| Variable | Default | Description |
|---|---|---|
| `AGENT_WATCH_NAME` | script filename | Agent name in the dashboard |
| `AGENT_WATCH_DB` | `~/.agent-watch/traces.db` | Custom database path |
| `AGENT_WATCH_DISABLED=true` | — | Disable without removing code |
For fine-grained control over traces and spans:
```javascript
import { createTracer } from 'agent-watch';

const tracer = createTracer({ name: 'my-agent' });
const trace = tracer.startTrace('process-order');
const span = trace.startSpan('validate-input');
span.setAttributes({ orderId: '123' });
span.end('ok');
trace.end();
```

Full SDK docs: SDK API Reference
SDK API Reference
```javascript
const tracer = createTracer({
  name: 'my-agent',
  store: 'sqlite',
  dbPath: './traces.db',
});
```

```javascript
// Manual
const trace = tracer.startTrace('task');
trace.end();

// Automatic
await tracer.withTrace('task', async (trace) => { ... });
```

```javascript
const span = trace.startSpan('llm-call');
span.setAttributes({ model: 'gpt-4', tokens: 150 });
span.end('ok'); // or span.endWithError(error)
```

```javascript
const openai = tracer.instrument(new OpenAI());
const anthropic = tracer.instrument(new Anthropic());
// All calls are now automatically traced
```

All data stays in a local SQLite file. No accounts. No API keys. No data leaves your machine.
MIT License — Copyright (c) 2026 Raul Rosello
```bash
git clone https://github.com/rankgnar/agent-watch
cd agent-watch
npm install
npm run build
```

PRs welcome. Open an issue first for major changes.
