AIR Blackbox

Compliance infrastructure for autonomous AI agents

EU AI Act compliant by default. Tamper-evident audit. Runtime enforcement. Every framework.

🎯 Live Demo · 📊 Dashboard · 🔀 Compare Runs


Interactive Demos

Explore each core component without installing anything:

| Component | What It Shows | Try It |
|---|---|---|
| Gateway | OpenAI-compatible proxy that records every LLM call as an OTel trace | Launch Demo → |
| Episode Store | Groups raw traces into replayable task-level episodes | Launch Demo → |
| Policy Engine | Risk-tiered autonomy, kill switches, and trust scoring | Launch Demo → |
| Platform | Full stack in one command — Docker Compose orchestration | Launch Demo → |
| OTel Collector | Redaction, cost metrics, and loop detection processor | Launch Demo → |
| Hosted Demo | Full platform walkthrough — 4 scenarios with live traces | Launch Demo → |
| Dashboard | Browse and inspect recorded episodes from the Episode Store | Launch Dashboard → |
| Compare Runs | Diff two agent runs side-by-side — steps, tokens, cost, policy | Launch Compare → |

The Problem

The EU AI Act enforcement date for high-risk AI systems is August 2, 2026. Companies deploying AI agents — tool-calling LLMs that take actions autonomously — face mandatory requirements around logging, transparency, human oversight, and data governance. Penalties under the Act reach up to €35M or 7% of global annual turnover for the most serious violations.

Most compliance platforms target CISOs with top-down dashboards. Nobody is giving developers the building blocks to make their agents compliant by default.

What AIR Blackbox Does

AIR Blackbox is the compliance infrastructure layer for AI agents. Drop-in SDKs that make your agent stack EU AI Act compliant — the same way Stripe made payments PCI compliant.

Your Agent → AIR Blackbox → Compliant, Auditable, Enforced

| EU AI Act Article | Requirement | AIR Feature |
|---|---|---|
| Art. 9 | Risk management | ConsentGate — risk classification and blocking policies |
| Art. 10 | Data governance | DataVault — PII tokenization before it reaches the LLM |
| Art. 11 | Technical documentation | Full call graph audit logging with timestamps |
| Art. 12 | Record-keeping | HMAC-SHA256 tamper-evident audit chain |
| Art. 14 | Human oversight | Consent-based tool gating with exception blocking |
| Art. 15 | Robustness & security | InjectionDetector + multi-layer defense |

See the full compliance mapping for article-by-article details.
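
To make the "tamper-evident audit chain" idea from the Art. 12 row concrete, here is a minimal sketch of an HMAC-SHA256 hash chain. It is an illustration under stated assumptions, not AIR Blackbox's actual implementation: the record layout and the names `append_record`, `verify_chain`, and `GENESIS` are hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hypothetical placeholder "previous MAC" for the first record


def _mac(key, prev_mac, event):
    # Serialize deterministically so the same event always yields the same MAC.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev_mac + payload).encode("utf-8"), hashlib.sha256).hexdigest()


def append_record(chain, key, event):
    """Append an event, binding it to the previous record's MAC."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    record = {"event": event, "prev": prev_mac, "mac": _mac(key, prev_mac, event)}
    chain.append(record)
    return record


def verify_chain(chain, key):
    """Return True only if every record's MAC and back-link are intact."""
    prev_mac = GENESIS
    for record in chain:
        expected = _mac(key, prev_mac, record["event"])
        if record["prev"] != prev_mac or not hmac.compare_digest(record["mac"], expected):
            return False
        prev_mac = record["mac"]
    return True
```

Because each MAC covers the previous one, editing, removing, or reordering any earlier record makes verification fail from that point onward, provided the key stays secret.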

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        YOUR AI AGENTS                            │
│  (OpenAI · LangChain · CrewAI · AutoGen · Any LLM framework)   │
└─────────────────────┬───────────────────────────────────────────┘
                      │ OTLP / HTTP
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                     INSTRUMENTATION LAYER                       │
│                                                                 │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
│  │  Python SDK  │ │ Trust Plugins│ │  Framework Connectors    ││
│  │  (pip)       │ │ (4 frameworks│ │  (CrewAI, LangChain,     ││
│  │              │ │  supported)  │ │   AutoGen, OpenAI Agents)││
│  └──────────────┘ └──────────────┘ └──────────────────────────┘│
└─────────────────────┬───────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                      CORE RUNTIME                               │
│                                                                 │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
│  │   Gateway    │ │Episode Store │ │    Policy Engine         ││
│  │  (Go proxy)  │ │ (SQLite +    │ │  (risk tiers, kill       ││
│  │              │ │  S3 vault)   │ │   switches, trust score) ││
│  └──────┬───────┘ └──────────────┘ └──────────────────────────┘│
│         │                                                       │
│         ▼                                                       │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐│
│  │ OTel Genai   │ │  Prompt      │ │  Semantic Normalizer     ││
│  │ Processor    │ │  Vault       │ │  (gen_ai.* → standard)   ││
│  │ (redact,     │ │ (encrypted   │ │                          ││
│  │  metrics,    │ │  storage)    │ │                          ││
│  │  loop detect)│ │              │ │                          ││
│  └──────────────┘ └──────────────┘ └──────────────────────────┘│
└─────────────────────┬───────────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    OBSERVABILITY BACKENDS                        │
│          Jaeger · Prometheus · Grafana · Datadog · Any OTLP     │
└─────────────────────────────────────────────────────────────────┘

Quick Start

Option 1: Full stack (Docker Compose)

git clone https://github.com/airblackbox/air-platform.git
cd air-platform
cp .env.example .env    # add your OPENAI_API_KEY
make up                 # starts Gateway + Episode Store + Policy Engine + Jaeger + Prometheus

Option 2: Python SDK only

pip install air-blackbox-sdk

Then wrap your client in Python:

import openai
from air_blackbox import AIRBlackbox

air = AIRBlackbox()
# Wraps your OpenAI client with automatic tracing
client = air.wrap(openai.OpenAI())

Option 3: OTel Collector processor (no code changes)

Add to your existing otelcol-config.yaml:

processors:
  genaisafe:
    redact:
      mode: hash_and_preview
      preview_chars: 48
    metrics:
      enable: true
    loop_detection:
      enable: true
      repeat_threshold: 6
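
As a rough picture of what a `hash_and_preview` redaction mode could do to a prompt attribute, the sketch below keeps the first `preview_chars` characters and replaces the rest with a SHA-256 digest. The function name, return shape, and exact behavior are assumptions for illustration. The processor's real output format is not specified here.

```python
import hashlib


def hash_and_preview(text, preview_chars=48):
    """Keep a short preview and a SHA-256 digest of the full value.

    Hypothetical helper: the digest lets two traces be matched on identical
    content without storing that content in full.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {
        "preview": text[:preview_chars],
        "sha256": digest,
        "truncated": len(text) > preview_chars,
    }
```

Note that the preview itself can still contain sensitive text, so a smaller `preview_chars` value trades debuggability for privacy.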

Components

Core Runtime

| Repository | Description | Demo |
|---|---|---|
| gateway | OpenAI-compatible reverse proxy — records every LLM call as an OpenTelemetry trace | View Demo |
| agent-episode-store | Groups raw traces into replayable task-level episodes (SQLite + S3) | View Demo |
| agent-policy-engine | Risk-tiered autonomy, kill switches, and trust scoring | View Demo |
| air-platform | Docker Compose orchestration — one command to run the full stack | View Demo |
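
The idea of grouping raw traces into task-level episodes can be sketched as follows. This assumes, purely for illustration, that each span is a dict with `trace_id`, `name`, `start`, and an optional `tokens` field, and that one trace maps to one episode. The Episode Store's real schema and grouping rules may differ.

```python
from collections import defaultdict


def group_into_episodes(spans):
    """Group spans by trace ID and order each group's steps by start time."""
    by_trace = defaultdict(list)
    for span in spans:
        by_trace[span["trace_id"]].append(span)

    episodes = []
    for trace_id, steps in by_trace.items():
        steps.sort(key=lambda s: s["start"])
        episodes.append({
            "episode_id": trace_id,  # hypothetical: reuse the trace ID
            "steps": [s["name"] for s in steps],
            "total_tokens": sum(s.get("tokens", 0) for s in steps),
        })
    return episodes
```

An episode built this way can be replayed step by step, or compared against another run of the same task.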

Instrumentation

| Repository | Description |
|---|---|
| python-sdk | Python SDK — wraps OpenAI, Anthropic, and other LLM clients |
| trust-crewai | Trust plugin for CrewAI multi-agent framework |
| trust-langchain | Trust plugin for LangChain / LangGraph |
| trust-autogen | Trust plugin for Microsoft AutoGen |
| trust-openai-agents | Trust plugin for OpenAI Agents SDK |

Safety & Governance

| Repository | Description | Demo |
|---|---|---|
| otel-collector-genai | OTel Collector processor — redaction, cost metrics, loop detection | View Demo |
| otel-prompt-vault | Encrypted prompt/completion storage with pre-signed URL retrieval | |
| otel-semantic-normalizer | Normalizes gen_ai.* and llm.* attributes to a standard schema | |
| agent-tool-sandbox | Sandboxed execution environment for agent tool calls | |
| runtime-aibom-emitter | Generates AI Bill of Materials at runtime | |
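
Attribute normalization, as described for otel-semantic-normalizer, amounts to mapping vendor-specific or legacy keys onto one canonical key. The sketch below shows the shape of that mapping. The alias table is illustrative only and is an assumption, not the project's actual schema.

```python
# Hypothetical alias table: legacy or vendor key -> canonical gen_ai.* key.
ALIASES = {
    "llm.request.model": "gen_ai.request.model",
    "llm.usage.prompt_tokens": "gen_ai.usage.input_tokens",
    "gen_ai.usage.prompt_tokens": "gen_ai.usage.input_tokens",
    "llm.usage.completion_tokens": "gen_ai.usage.output_tokens",
    "gen_ai.usage.completion_tokens": "gen_ai.usage.output_tokens",
}


def normalize_attributes(attributes):
    """Rename known aliases and pass unknown keys through unchanged.

    If two source keys map to the same canonical key, the first one
    encountered wins.
    """
    normalized = {}
    for key, value in attributes.items():
        normalized.setdefault(ALIASES.get(key, key), value)
    return normalized
```

With a single canonical schema, downstream cost and token metrics do not need per-framework special cases.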

Compliance

| Component | What It Does |
|---|---|
| air-compliance | CLI scanner — checks your project for EU AI Act compliance coverage |

Evaluation & Testing

| Repository | Description |
|---|---|
| eval-harness | CLI tool for replaying and scoring episodes against policies |
| trace-regression-harness | Detects behavioral regressions across agent versions |
| agent-vcr | Record and replay agent interactions for deterministic testing |

Security

| Repository | Description |
|---|---|
| mcp-security-scanner | Scans MCP server configurations for security vulnerabilities |
| mcp-policy-gateway | Policy enforcement gateway for Model Context Protocol |

Why Infrastructure-Level Compliance?

Most teams try to add compliance at the application level — inside each agent, each framework, each service. This approach fails because:

  • Every team re-invents audit logging differently (and none are tamper-evident)
  • PII leaks through cracks between implementations
  • No single chain of custody across framework boundaries
  • When regulators ask "prove it" — nobody has mathematically verifiable logs

AIR Blackbox operates at the infrastructure level — as framework-native SDKs, an OTel Collector processor, a reverse proxy, and a policy engine. Three lines of code activate compliance across your entire agent stack.

Threat Model

AIR Blackbox addresses four attack vectors in GenAI observability:

| Threat | Risk | Mitigation |
|---|---|---|
| Prompt Data Leakage | PII, proprietary data exposed in traces | SHA-256 redaction with configurable preview |
| Secret Exposure | API keys, bearer tokens in span attributes | Denylist regex patterns, automatic detection |
| Runaway Loops | Infinite tool-calling burning budget | Repeat threshold detection, span flagging |
| Cost Blind Spots | No normalized token/cost visibility | Unified metrics extraction from any format |
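
The "Runaway Loops" mitigation can be pictured as counting repeated tool calls and flagging those past a threshold. The sketch below is one possible reading of repeat-threshold detection. It assumes a call is identified by its tool name plus a string of arguments, and that repeats are counted across the whole trace rather than only consecutively. The name `flag_loops` is hypothetical.

```python
from collections import Counter


def flag_loops(calls, repeat_threshold=6):
    """Return indices of calls whose (tool, args) signature has now been
    seen at least repeat_threshold times within the trace."""
    seen = Counter()
    flagged = []
    for index, (tool, args) in enumerate(calls):
        signature = (tool, args)
        seen[signature] += 1
        if seen[signature] >= repeat_threshold:
            flagged.append(index)
    return flagged
```

A policy layer could then halt the agent or alert an operator once any span is flagged, instead of letting the loop keep spending tokens.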

License

All AIR Blackbox components are released under the Apache License 2.0.

Contributing

We welcome contributions. See CONTRIBUTING.md for guidelines.


Support the Project

If AIR Blackbox is useful to you, a star helps others find it.

Questions or feedback? Start a Discussion.


AIR Blackbox — Agent Infrastructure Runtime