
# agent-run

Open standard for agent observability.

One schema for how AI agents report work — regardless of runtime, model, or provider.

## Why

Code is commoditizing. The durable value in AI agent infrastructure is data, provenance, protocols, and evals. This spec is the protocol layer.

agent-run defines a standard AgentRun object that any agent runtime can emit and any dashboard can consume. It answers: what ran, how it ran, what it cost, and whether it worked.

## Core Objects

| Object | Purpose |
| --- | --- |
| `AgentRun` | A single agent execution — the atomic unit of observability |
| `Step` | One action within a run (reasoning, tool call, error, handoff) |
| `Cost` | Token usage and dollar cost attribution |
| `Provenance` | Cryptographic proof of how an output was produced |
| `EvalResult` | Scoring and quality assessment of a run |
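As a rough sketch of how these objects nest inside a run — the field names below follow the table and the examples later in this README, but any shape not shown there (e.g. the `Step.type` enum, the `status` values) is an assumption; the JSON Schemas in `schemas/` are normative:

```typescript
// Illustrative shapes only, not the normative schema.
type Step = {
  type: 'reasoning' | 'tool_call' | 'error' | 'handoff'; // enum assumed from the table above
  started_at: string;                                    // ISO 8601 timestamp
};

type Cost = { input_tokens: number; output_tokens: number; cost_usd: number };

type AgentRun = {
  id: string;
  agent_id: string;
  status: 'running' | 'completed' | 'failed';            // value set is an assumption
  started_at: string;
  steps: Step[];                                         // the run owns its steps
  cost: Cost;
  provenance: { run_hash: string };
};

const run: AgentRun = {
  id: '00000000-0000-4000-8000-000000000000',
  agent_id: 'demo-agent',
  status: 'completed',
  started_at: new Date().toISOString(),
  steps: [{ type: 'tool_call', started_at: new Date().toISOString() }],
  cost: { input_tokens: 10, output_tokens: 5, cost_usd: 0.0001 },
  provenance: { run_hash: 'deadbeef' },
};
```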

## Quick Start

### TypeScript

```sh
pnpm add @agent-run/types
```

```ts
import type { AgentRun } from '@agent-run/types';

const key = process.env.AGENT_RUN_API_KEY ?? ''; // API key for the target server

const run: AgentRun = {
  id: crypto.randomUUID(),
  agent_id: 'my-agent',
  status: 'completed',
  started_at: new Date().toISOString(),
  steps: [],
  cost: { input_tokens: 1500, output_tokens: 800, cost_usd: 0.012 },
  provenance: { run_hash: '...' }
};

// Report to any agent-run compliant server
await fetch('http://localhost:3000/api/v1/runs', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'X-API-Key': key },
  body: JSON.stringify(run)
});
```

### Rust

```toml
[dependencies]
agent-run-types = "0.1"
```

```rust
use agent_run_types::AgentRun;
```

## Raw JSON Schema

Schemas are in `schemas/`. Validate any JSON against them:

```sh
npx ajv validate -s schemas/agent-run.json -r 'schemas/*.json' -d my-run.json
```
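Before reaching for full schema validation, a server can cheaply pre-check the spec's required fields (`id`, `agent_id`, `status`, `started_at`, `steps`, `cost`, `provenance` — see Design Principles below). A hedged sketch of that pre-flight check; a compliant server would still validate the full payload against `schemas/agent-run.json` with Ajv or similar:

```typescript
// Required fields per the spec's "incrementally adoptable" principle.
const REQUIRED = ['id', 'agent_id', 'status', 'started_at', 'steps', 'cost', 'provenance'] as const;

// Returns the required fields a candidate run is still missing.
function missingFields(run: Record<string, unknown>): string[] {
  return REQUIRED.filter((f) => run[f] === undefined);
}

const partial = { id: 'r1', agent_id: 'a1', status: 'completed' };
console.log(missingFields(partial)); // → [ 'started_at', 'steps', 'cost', 'provenance' ]
```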

## Schema Files

```
schemas/
  agent-run.json    # Root object — the run itself
  step.json         # Individual steps within a run
  cost.json         # Token usage and cost
  provenance.json   # Cryptographic audit trail
  eval.json         # Evaluation/scoring results
  openapi.yaml      # Full API spec for compliant servers
```

## Provenance

Every run gets a `run_hash` — a SHA-256 of the canonical inputs (`agent_id`, model, tools, config, trigger). Runs triggered by other runs form a hash chain via `lineage`. This creates a verifiable audit trail: given the same inputs, anyone can reproduce the hash.

Optional Ed25519 signatures (`signed_by` + `signature`) enable tamper detection for enterprise use.
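A minimal sketch of computing a `run_hash` from the inputs listed above. The canonicalization used here (recursively sorted object keys) is an assumption for illustration — the spec's schemas define the normative canonical form:

```typescript
import { createHash } from 'node:crypto';

// Deterministic serialization: objects with the same content hash identically
// regardless of key insertion order. This sorting scheme is an assumption.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return '[' + value.map(canonicalize).join(',') + ']';
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value as object)
      .sort()
      .map((k) => JSON.stringify(k) + ':' + canonicalize((value as Record<string, unknown>)[k]));
    return '{' + entries.join(',') + '}';
  }
  return JSON.stringify(value);
}

// SHA-256 over the canonical inputs named in the spec.
function runHash(inputs: {
  agent_id: string;
  model: string;
  tools: string[];
  config: Record<string, unknown>;
  trigger: string;
}): string {
  return createHash('sha256').update(canonicalize(inputs)).digest('hex');
}
```

Because the serialization is deterministic, two parties given the same inputs reproduce the same 64-character hex digest — the property the audit trail relies on.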

## Evals

Runs can be scored after completion. The `EvalResult` tracks:

- pass/fail against acceptance criteria
- `score` (0-100) for nuanced grading
- metrics: cost, duration, tool calls, retries, convergence
- regression detection via `regression_from` linking

Benchmark packs in `bench/` provide reproducible evaluation scenarios.
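A hedged sketch of the shape these bullets describe and how a grader might use it. Field names follow the bullets; the exact schema lives in `schemas/eval.json`, and the pass policy below (score threshold plus cost budget) is an invented example, not part of the spec:

```typescript
// Illustrative EvalResult shape — see schemas/eval.json for the real one.
type EvalResult = {
  run_id: string;           // assumption: links the eval back to its run
  pass: boolean;            // pass/fail against acceptance criteria
  score: number;            // 0-100 nuanced grade
  metrics: { cost_usd: number; duration_ms: number; tool_calls: number; retries: number };
  regression_from?: string; // prior eval this result regresses from
};

// Example policy (assumed, not spec'd): pass when the score clears a
// threshold AND the run stayed within its cost budget.
function grade(score: number, costUsd: number, budgetUsd: number): Pick<EvalResult, 'pass' | 'score'> {
  return { pass: score >= 70 && costUsd <= budgetUsd, score };
}
```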

## Compliant Servers

| Server | Status |
| --- | --- |
| Mission Control | Reference implementation |

Want to add yours? Open a PR.

## Compliant Runtimes

| Runtime | Status |
| --- | --- |
| Mission Control (spawned agents) | Built-in |
| OneClaw | Planned |
| Claude Code (via MC MCP) | Via adapter |

Building an agent runtime? Emit AgentRun objects to any compliant server.

## Design Principles

1. **Schema-first** — JSON Schema is the source of truth. Types are generated.
2. **Privacy-preserving** — `input_preview` and `output_preview` are truncated. Full prompts/outputs are never required by the spec.
3. **Runtime-agnostic** — Works with any agent framework, model, or provider.
4. **Incrementally adoptable** — Only `id`, `agent_id`, `status`, `started_at`, `steps`, `cost`, and `provenance` are required. Everything else is optional.
5. **Extensible** — `metadata` fields on every object for runtime-specific data.
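Principle 2 in practice: a runtime stores a truncated preview rather than the full prompt or output. A minimal sketch — the 256-character limit and the `toPreview` helper name are illustrative assumptions, not spec'd values:

```typescript
// Truncate text to a bounded preview; full prompts/outputs never leave the runtime.
function toPreview(text: string, max = 256): string {
  return text.length <= max ? text : text.slice(0, max - 1) + '…';
}
```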

## License

MIT
