
agent-watch


Observability for AI apps in production. See every LLM call your app makes — model, tokens, latency, errors — in a local dashboard. Zero cloud. Zero config.

[Screenshot: agent-watch dashboard]


Who is this for?

If you're building software that calls AI APIs (OpenAI, Anthropic, OpenRouter, Groq, Ollama, etc.), agent-watch shows you exactly what's happening under the hood:

  • Backend developers building chatbots, AI assistants, or automated workflows
  • Startup teams running AI agents in production that need to debug failures fast
  • Freelancers building AI-powered apps for clients who ask "why did it break?"
  • Teams with AI pipelines processing documents, analyzing data, or generating content

The problem it solves

Your AI app calls GPT-4 or Claude 50 times per request. One of those calls fails, returns garbage, or takes 10 seconds. Without agent-watch, you dig through logs for hours. With it, you open a dashboard and see the exact call that failed, why, and how long it took.
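The core idea is simple to sketch: wrap each LLM call and record its name, model, latency, and outcome. The snippet below is a hand-rolled illustration of the kind of record agent-watch captures for you automatically, not its actual implementation; every name in it is made up.

```javascript
// Illustrative only: the shape of a per-call record, roughly what
// agent-watch collects. `fakeLLMCall` stands in for a real provider request.
async function traced(name, model, fn) {
  const start = Date.now();
  try {
    const output = await fn();
    return { name, model, ms: Date.now() - start, status: 'ok', output };
  } catch (err) {
    return { name, model, ms: Date.now() - start, status: 'error', error: String(err.message || err) };
  }
}

async function fakeLLMCall() {
  return { tokens: 150, text: 'billing question' };
}

traced('classify-ticket', 'gpt-4', fakeLLMCall).then((record) => {
  console.log(record.status); // prints: ok
});
```

agent-watch does this for every call without you writing the wrapper, and stores the records so you can query them later.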

NOT for

  • End users chatting with ChatGPT (you don't need this)
  • Non-technical users (this is a developer tool)

Quick Start

Option 1: Proxy mode (any language — recommended)

Run one command. No code changes needed:

npx agent-watch --target https://openrouter.ai/api/v1

Then point your app to http://localhost:4201 instead of the API URL:

# Python
client = OpenAI(base_url="http://localhost:4201/v1")

// JavaScript
const client = new OpenAI({ baseURL: "http://localhost:4201/v1" });

# curl
curl http://localhost:4201/v1/chat/completions -H "Authorization: Bearer YOUR_KEY" ...

Open http://localhost:4200 → see every call in real time.

Works from any language (Python, Go, Rust, Java, or plain curl) and with any provider:

npx agent-watch --target https://api.openai.com        # OpenAI
npx agent-watch --target https://api.anthropic.com      # Anthropic
npx agent-watch --target https://openrouter.ai/api/v1   # OpenRouter
npx agent-watch --target http://localhost:11434          # Ollama (local)
npx agent-watch --target https://api.groq.com/openai    # Groq
npx agent-watch --target https://api.together.xyz       # Together
npx agent-watch --target https://api.mistral.ai         # Mistral

Security: The proxy only listens on 127.0.0.1 (your machine). API keys are forwarded to the provider but never stored locally.

Option 2: One-liner for Node.js

If your app is Node.js, just add one line at the top:

require('agent-watch/auto');

Every OpenAI/Anthropic call is traced automatically. Run the dashboard with:

npx agent-watch serve

What you see in the dashboard

✓ classify-ticket      0.8s   gpt-4    tokens: 150    ok
✓ lookup-customer      0.2s   —        DB query       ok
✗ generate-response    ERROR  gpt-4    rate_limit     error
✓ retry-response       1.2s   gpt-4    tokens: 340    ok
✓ send-email           0.1s   —        SMTP           ok

Click any trace to see the full span tree — every step your agent took, with timing, token counts, and the exact error message when something breaks.


Real-world use cases

🔍 "My AI agent is giving wrong answers"

Open the dashboard, find the trace, see which LLM call returned unexpected output. Fix the prompt, not the whole app.

💰 "How much are we spending on tokens?"

Every trace shows input/output token counts. See which agent or workflow burns the most API budget.
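If you export trace records, a cost breakdown is a few lines of aggregation. A sketch, assuming hypothetical record fields (`workflow`, `inputTokens`, `outputTokens`) rather than agent-watch's actual export format:

```javascript
// Sum token usage per workflow from a flat list of trace records.
// Field names are illustrative, not agent-watch's export schema.
function tokensByWorkflow(records) {
  const totals = {};
  for (const r of records) {
    totals[r.workflow] = (totals[r.workflow] || 0) + (r.inputTokens || 0) + (r.outputTokens || 0);
  }
  return totals;
}
```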

⚡ "The app is slow, but I don't know why"

Latency per span. Instantly see if it's the LLM (2s), the database (50ms), or the tool call (5s timeout).

🔄 "The agent is stuck in a loop"

The span tree makes it obvious — 47 identical calls to the same tool. No more guessing from flat logs.
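Loop detection like this is just counting consecutive repeats of the same span name. A sketch of the idea (purely illustrative, not agent-watch code):

```javascript
// Find the longest run of consecutive identical span names —
// a long run is the signature of an agent stuck retrying one tool.
function longestRepeat(spanNames) {
  let best = 0, run = 0, prev = null;
  for (const name of spanNames) {
    run = name === prev ? run + 1 : 1;
    prev = name;
    if (run > best) best = run;
  }
  return best;
}
```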

🛡️ "We need an audit trail"

Regulated industries (finance, healthcare) need records of AI decisions. agent-watch stores the full decision tree with timestamps.

🧪 "Did my prompt change break anything?"

Compare traces before and after. See how token usage, latency, and outputs differ.


CLI

agent-watch --target <url>           # Start proxy + dashboard
agent-watch serve --port 4200        # Dashboard only
agent-watch list                     # Recent traces
agent-watch list --status error      # Only failures
agent-watch replay <trace-id>        # Full span tree
agent-watch stats                    # Error rate, avg latency, top failures

Configuration (optional)

Variable               Default                     Description
AGENT_WATCH_NAME       script filename             Agent name shown in the dashboard
AGENT_WATCH_DB         ~/.agent-watch/traces.db    Custom database path
AGENT_WATCH_DISABLED   unset                       Set to true to disable tracing without removing code

Advanced: Manual SDK

For fine-grained control over traces and spans:

import { createTracer } from 'agent-watch';

const tracer = createTracer({ name: 'my-agent' });
const trace = tracer.startTrace('process-order');

const span = trace.startSpan('validate-input');
span.setAttributes({ orderId: '123' });
span.end('ok');

trace.end();


SDK API Reference

createTracer(config)

const tracer = createTracer({
  name: 'my-agent',
  store: 'sqlite',
  dbPath: './traces.db',
});

tracer.startTrace(name) / tracer.withTrace(name, fn)

// Manual
const trace = tracer.startTrace('task');
trace.end();

// Automatic
await tracer.withTrace('task', async (trace) => { ... });

trace.startSpan(name)

const span = trace.startSpan('llm-call');
span.setAttributes({ model: 'gpt-4', tokens: 150 });
span.end('ok');          // or span.endWithError(error)

tracer.instrument(client)

const openai = tracer.instrument(new OpenAI());
const anthropic = tracer.instrument(new Anthropic());
// All calls are now automatically traced
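As a rough mental model for what `instrument()`-style wrapping does — not agent-watch's actual implementation — a Proxy can intercept every method call on a client and time it:

```javascript
// Sketch: wrap a client so every method call is timed and logged.
// Illustrative only; agent-watch's instrument() also records models,
// tokens, and errors per call.
function timeMethods(client, log) {
  return new Proxy(client, {
    get(target, prop) {
      const value = target[prop];
      if (typeof value !== 'function') return value; // pass plain properties through
      return async (...args) => {
        const start = Date.now();
        try {
          return await value.apply(target, args);
        } finally {
          log.push({ method: String(prop), ms: Date.now() - start });
        }
      };
    },
  });
}
```

The wrapped client behaves identically to the original, which is why instrumentation needs no changes at the call sites.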

Self-hosted. Zero cloud. MIT.

All data stays in a local SQLite file. No accounts, no sign-up. Your provider API keys pass through the proxy but are never stored, and no trace data leaves your machine.

MIT License — Copyright (c) 2026 Raul Rosello

Contributing

git clone https://github.com/rankgnar/agent-watch
cd agent-watch
npm install
npm run build

PRs welcome. Open an issue first for major changes.
