EU AI Act compliant by default. Tamper-evident audit. Runtime enforcement. Every framework.
🎯 Live Demo · 📊 Dashboard · 🔀 Compare Runs
Explore each core component without installing anything:
| Component | What It Shows | Try It |
|---|---|---|
| Gateway | OpenAI-compatible proxy that records every LLM call as an OTel trace | Launch Demo → |
| Episode Store | Groups raw traces into replayable task-level episodes | Launch Demo → |
| Policy Engine | Risk-tiered autonomy, kill switches, and trust scoring | Launch Demo → |
| Platform | Full stack in one command — Docker Compose orchestration | Launch Demo → |
| OTel Collector | Redaction, cost metrics, and loop detection processor | Launch Demo → |
| Hosted Demo | Full platform walkthrough — 4 scenarios with live traces | Launch Demo → |
| Dashboard | Browse and inspect recorded episodes from the Episode Store | Launch Dashboard → |
| Compare Runs | Diff two agent runs side-by-side — steps, tokens, cost, policy | Launch Compare → |
The EU AI Act enforcement date for high-risk AI systems is August 2, 2026. Companies deploying AI agents — tool-calling LLMs that take actions autonomously — face mandatory requirements around logging, transparency, human oversight, and data governance. Penalties: up to €35M or 7% of global turnover.
Most compliance platforms target CISOs with top-down dashboards. Nobody is giving developers the building blocks to make their agents compliant by default.
AIR Blackbox is the compliance infrastructure layer for AI agents. Drop-in SDKs that make your agent stack EU AI Act compliant — the same way Stripe made payments PCI compliant.
Your Agent → AIR Blackbox → Compliant, Auditable, Enforced
| EU AI Act Article | Requirement | AIR Feature |
|---|---|---|
| Art. 9 | Risk management | ConsentGate — risk classification and blocking policies |
| Art. 10 | Data governance | DataVault — PII tokenization before it reaches the LLM |
| Art. 11 | Technical documentation | Full call graph audit logging with timestamps |
| Art. 12 | Record-keeping | HMAC-SHA256 tamper-evident audit chain |
| Art. 14 | Human oversight | Consent-based tool gating with exception blocking |
| Art. 15 | Robustness & security | InjectionDetector + multi-layer defense |
See the full compliance mapping for article-by-article details.
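The Art. 12 row above refers to an HMAC-SHA256 audit chain. As a rough illustration of why such a chain is tamper-evident, here is a minimal sketch in plain Python (the record fields and function names are invented for this example, not the AIR Blackbox API): each entry's MAC covers the previous entry's MAC, so editing or deleting any record breaks every MAC after it.

```python
import hashlib
import hmac
import json

def append_record(chain: list, record: dict, key: bytes) -> None:
    """Append a record whose MAC covers the previous record's MAC."""
    prev_mac = chain[-1]["mac"] if chain else "genesis"
    payload = json.dumps(record, sort_keys=True)
    mac = hmac.new(key, (prev_mac + payload).encode("utf-8"), hashlib.sha256).hexdigest()
    chain.append({"record": record, "mac": mac})

def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; editing, dropping, or reordering
    any earlier record invalidates everything after it."""
    prev_mac = "genesis"
    for entry in chain:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hmac.new(key, (prev_mac + payload).encode("utf-8"), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True

key = b"audit-signing-key"           # in practice: a managed secret, not a literal
chain: list = []
append_record(chain, {"tool": "web_search", "ts": 1}, key)
append_record(chain, {"tool": "send_email", "ts": 2}, key)
assert verify_chain(chain, key)

chain[0]["record"]["ts"] = 99        # tamper with history...
assert not verify_chain(chain, key)  # ...and verification fails
```

A log signed this way answers the regulator's "prove it" question mathematically: a verifier holding the key can detect any post-hoc edit without trusting the operator.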
```
┌───────────────────────────────────────────────────────────────┐
│                        YOUR AI AGENTS                         │
│  (OpenAI · LangChain · CrewAI · AutoGen · Any LLM framework)  │
└─────────────────────┬─────────────────────────────────────────┘
                      │ OTLP / HTTP
                      ▼
┌───────────────────────────────────────────────────────────────┐
│                     INSTRUMENTATION LAYER                     │
│                                                               │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
│ │  Python SDK  │ │Trust Plugins │ │   Framework Connectors   ││
│ │    (pip)     │ │(4 frameworks │ │ (CrewAI, LangChain,      ││
│ │              │ │  supported)  │ │ AutoGen, OpenAI Agents)  ││
│ └──────────────┘ └──────────────┘ └──────────────────────────┘│
└─────────────────────┬─────────────────────────────────────────┘
                      │
                      ▼
┌───────────────────────────────────────────────────────────────┐
│                         CORE RUNTIME                          │
│                                                               │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
│ │   Gateway    │ │Episode Store │ │      Policy Engine       ││
│ │  (Go proxy)  │ │  (SQLite +   │ │ (risk tiers, kill        ││
│ │              │ │   S3 vault)  │ │  switches, trust score)  ││
│ └──────┬───────┘ └──────────────┘ └──────────────────────────┘│
│        │                                                      │
│        ▼                                                      │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
│ │  OTel Genai  │ │   Prompt     │ │   Semantic Normalizer    ││
│ │  Processor   │ │   Vault      │ │  (gen_ai.* → standard)   ││
│ │  (redact,    │ │  (encrypted  │ │                          ││
│ │   metrics,   │ │   storage)   │ │                          ││
│ │ loop detect) │ │              │ │                          ││
│ └──────────────┘ └──────────────┘ └──────────────────────────┘│
└─────────────────────┬─────────────────────────────────────────┘
                      │
                      ▼
┌───────────────────────────────────────────────────────────────┐
│                    OBSERVABILITY BACKENDS                     │
│      Jaeger · Prometheus · Grafana · Datadog · Any OTLP       │
└───────────────────────────────────────────────────────────────┘
```
Option 1: Full stack (Docker Compose)

```bash
git clone https://github.com/airblackbox/air-platform.git
cd air-platform
cp .env.example .env   # add your OPENAI_API_KEY
make up                # starts Gateway + Episode Store + Policy Engine + Jaeger + Prometheus
```

Option 2: Python SDK only

```bash
pip install air-blackbox-sdk
```

```python
import openai

from air_blackbox import AIRBlackbox

air = AIRBlackbox()
# Wraps your OpenAI client with automatic tracing
client = air.wrap(openai.OpenAI())
```

Option 3: OTel Collector processor (no code changes)
Add to your existing `otelcol-config.yaml`:

```yaml
processors:
  genaisafe:
    redact:
      mode: hash_and_preview
      preview_chars: 48
    metrics:
      enable: true
    loop_detection:
      enable: true
      repeat_threshold: 6
```

| Repository | Description | Demo |
|---|---|---|
| gateway | OpenAI-compatible reverse proxy — records every LLM call as an OpenTelemetry trace | View Demo |
| agent-episode-store | Groups raw traces into replayable task-level episodes (SQLite + S3) | View Demo |
| agent-policy-engine | Risk-tiered autonomy, kill switches, and trust scoring | View Demo |
| air-platform | Docker Compose orchestration — one command to run the full stack | View Demo |
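Because the gateway is OpenAI-compatible, switching an agent onto it is just a base-URL change; the request body stays identical. A minimal sketch using only the standard library (the port, path, and key below are placeholders, not the gateway's documented defaults):

```python
import json
import urllib.request

# Placeholder address; check the gateway repo for its real default port.
GATEWAY_URL = "http://localhost:8080/v1"

def chat(messages: list, model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build the same request you would send to api.openai.com, aimed at
    the gateway, which records an OTel trace and forwards it upstream."""
    return urllib.request.Request(
        f"{GATEWAY_URL}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer YOUR_OPENAI_API_KEY",  # placeholder key
        },
    )

req = chat([{"role": "user", "content": "ping"}])
# urllib.request.urlopen(req) would send it through the gateway.
```

Any SDK that accepts a `base_url` override (most OpenAI-compatible clients do) can be pointed at the gateway the same way, with no other code changes.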
| Repository | Description |
|---|---|
| python-sdk | Python SDK — wraps OpenAI, Anthropic, and other LLM clients |
| trust-crewai | Trust plugin for CrewAI multi-agent framework |
| trust-langchain | Trust plugin for LangChain / LangGraph |
| trust-autogen | Trust plugin for Microsoft AutoGen |
| trust-openai-agents | Trust plugin for OpenAI Agents SDK |
| Repository | Description | Demo |
|---|---|---|
| otel-collector-genai | OTel Collector processor — redaction, cost metrics, loop detection | View Demo |
| otel-prompt-vault | Encrypted prompt/completion storage with pre-signed URL retrieval | — |
| otel-semantic-normalizer | Normalizes gen_ai.* and llm.* attributes to a standard schema | — |
| agent-tool-sandbox | Sandboxed execution environment for agent tool calls | — |
| runtime-aibom-emitter | Generates AI Bill of Materials at runtime | — |
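To activate the otel-collector-genai processor from the quick start, it also has to be referenced in a collector pipeline. A sketch of the wiring, where the receiver and exporter names are placeholders for whatever your collector already defines:

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]          # placeholder: your existing receiver
      processors: [genaisafe]    # redaction, cost metrics, loop detection
      exporters: [otlp/backend]  # placeholder: Jaeger, Datadog, any OTLP
```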
## Compliance
| Component | What It Does |
|---|---|
| air-compliance | CLI scanner — checks your project for EU AI Act compliance coverage |
| Repository | Description |
|---|---|
| eval-harness | CLI tool for replaying and scoring episodes against policies |
| trace-regression-harness | Detects behavioral regressions across agent versions |
| agent-vcr | Record and replay agent interactions for deterministic testing |
| Repository | Description |
|---|---|
| mcp-security-scanner | Scans MCP server configurations for security vulnerabilities |
| mcp-policy-gateway | Policy enforcement gateway for Model Context Protocol |
Most teams try to add compliance at the application level — inside each agent, each framework, each service. This approach fails because:
- Every team re-invents audit logging differently (and none are tamper-evident)
- PII leaks through cracks between implementations
- No single chain of custody across framework boundaries
- When regulators ask "prove it" — nobody has mathematically verifiable logs
AIR Blackbox operates at the infrastructure level — as framework-native SDKs, an OTel Collector processor, a reverse proxy, and a policy engine. Three lines of code activate compliance across your entire agent stack.
AIR Blackbox addresses four attack vectors in GenAI observability:
| Threat | Risk | Mitigation |
|---|---|---|
| Prompt Data Leakage | PII, proprietary data exposed in traces | SHA-256 redaction with configurable preview |
| Secret Exposure | API keys, bearer tokens in span attributes | Denylist regex patterns, automatic detection |
| Runaway Loops | Infinite tool-calling burning budget | Repeat threshold detection, span flagging |
| Cost Blind Spots | No normalized token/cost visibility | Unified metrics extraction from any format |
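The SHA-256 redaction row can be illustrated with a short sketch. This is not the collector's actual code (the processor lives in the OTel Collector, and the function here is invented); it just shows the `hash_and_preview` idea: traces keep a stable digest for correlation plus a bounded preview, never the full prompt.

```python
import hashlib

def redact(value: str, preview_chars: int = 48) -> str:
    """Replace a span attribute with a digest plus a bounded preview."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"sha256:{digest} preview:{value[:preview_chars]}"

prompt = "My card number is 4111 1111 1111 1111, please update billing."
redacted = redact(prompt, preview_chars=10)
# Identical prompts still correlate by digest, but the card number is gone.
```

The digest lets two runs with the same prompt be matched in a backend without either run ever storing the sensitive text.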
All AIR Blackbox components are released under the Apache License 2.0.
We welcome contributions. See CONTRIBUTING.md for guidelines.
If AIR Blackbox is useful to you, a star helps others find it.
Questions or feedback? Start a Discussion.
AIR Blackbox — Agent Infrastructure Runtime
Compliance infrastructure for autonomous AI agents
