High-Performance, Real-time Safety and Compliance Layer for Streaming LLMs
CheckStream is a production-ready Rust guardrail platform that enforces safety, security, and regulatory compliance on LLM outputs as tokens stream, targeting under 10ms of added latency (~5-8ms with pattern classifiers only). It works with any LLM provider.
Version: 0.1.0
Status: Core Complete - Production Ready for Testing
| Component | Status | Details |
|---|---|---|
| Three-Phase Proxy | Complete | Ingress, Midstream, Egress pipelines |
| ML Classifiers | Working | DistilBERT sentiment from HuggingFace |
| Pattern Classifiers | Complete | PII, prompt injection, custom patterns |
| Policy Engine | Complete | Triggers, actions, composite rules |
| Action Executor | Complete | Stop, Redact, Log, Audit actions |
| Audit Trail | Complete | Hash-chained, tamper-evident logging |
| Telemetry | Complete | Prometheus metrics, structured logging |
| Security Hardening | Complete | SSRF protection, timing-safe auth, security headers |
| Tests | 122 passing | Unit, integration, ML classifier tests |
# Clone and build
git clone https://github.com/Skelf-Research/checkstream.git
cd checkstream
cargo build --release --features ml-models
# Run the proxy
./target/release/checkstream-proxy \
  --backend https://api.openai.com/v1 \
  --policy ./policies/default.yaml \
  --port 8080

# Run the sentiment classifier example
cargo run --example test_hf_model --features ml-models
# Output:
# Model loaded successfully!
# "I love this movie!" → positive (1.000)
# "This is terrible." → negative (1.000)

cargo test --workspace # 122 tests pass

┌─────────────────────────────────────────────────────────────┐
│                      Your Application                       │
│              (OpenAI SDK, Anthropic SDK, etc.)              │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                      CheckStream Proxy                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │   Phase 1    │  │   Phase 2    │  │   Phase 3    │       │
│  │   INGRESS    │→ │  MIDSTREAM   │→ │    EGRESS    │       │
│  │   Validate   │  │    Stream    │  │  Compliance  │       │
│  │    Prompt    │  │    Checks    │  │   & Audit    │       │
│  │    (~3ms)    │  │ (~2ms/chunk) │  │   (async)    │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
│                                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                 Classifier Pipeline                  │   │
│  │    Pattern (Tier A) → ML Models (Tier B) → Policy    │   │
│  └──────────────────────────────────────────────────────┘   │
└──────────────────────────┬──────────────────────────────────┘
                           │
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
       ┌────────┐     ┌─────────┐     ┌─────────┐
       │ OpenAI │     │ Claude  │     │  vLLM   │
       └────────┘     └─────────┘     └─────────┘
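
One consequence of checking chunks in Phase 2 is that a blocked phrase can arrive split across two streamed chunks. The sketch below shows one way to handle that with a small carry-over buffer. It is an illustration under stated assumptions, not CheckStream's actual code: `MidstreamChecker`, `Verdict`, and `check_chunk` are hypothetical names, and matching is a plain ASCII case-insensitive substring search.

```rust
/// Outcome of checking one streamed chunk (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Verdict {
    Pass,
    Stop,
}

/// Hypothetical midstream checker that remembers the tail of the stream so a
/// phrase split across chunk boundaries is still detected.
pub struct MidstreamChecker {
    pattern: String, // stored ASCII-lowercased
    tail: String,    // last (pattern length - 1) characters seen so far
}

impl MidstreamChecker {
    pub fn new(pattern: &str) -> Self {
        Self {
            pattern: pattern.to_ascii_lowercase(),
            tail: String::new(),
        }
    }

    /// Check one chunk against the pattern, using the carried-over tail.
    pub fn check_chunk(&mut self, chunk: &str) -> Verdict {
        // Search window = carried-over tail + the new chunk.
        let mut window = std::mem::take(&mut self.tail);
        window.push_str(&chunk.to_ascii_lowercase());

        if !self.pattern.is_empty() && window.contains(&self.pattern) {
            return Verdict::Stop;
        }

        // Keep only as many trailing characters as could start a match
        // that completes in the next chunk.
        let keep = self.pattern.chars().count().saturating_sub(1);
        let total = window.chars().count();
        self.tail = window.chars().skip(total.saturating_sub(keep)).collect();
        Verdict::Pass
    }
}
```

Feeding `"Please IGNORE prev"` and then `"ious instructions"` to a checker built with `"ignore previous instructions"` passes the first chunk and stops on the second. The buffer never grows beyond the pattern length, so per-chunk cost stays bounded.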
Load models from HuggingFace with zero code:
# models/registry.yaml
models:
  sentiment:
    source:
      type: huggingface
      repo: "distilbert-base-uncased-finetuned-sst-2-english"
    architecture:
      type: distil-bert-sequence-classification
      num_labels: 2
      labels: ["negative", "positive"]
    inference:
      device: "cpu" # or "cuda" for GPU
      max_length: 512

Performance:
- CPU (DistilBERT): ~30-50ms per inference
- GPU (estimated): ~2-10ms per inference
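
For readers wondering how a two-label sequence classifier ends up printing something like `positive (1.000)`, the usual post-processing is a softmax over the model's output logits followed by picking the highest-probability label. The sketch below illustrates that step only. The function names `softmax` and `top_label` are hypothetical, and whether CheckStream performs this step in exactly this way is an assumption.

```rust
/// Numerically stable softmax: subtract the max logit before exponentiating.
pub fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Return the most probable label and its probability, or `None` if the
/// logits are empty or do not line up with the label list.
pub fn top_label<'a>(logits: &[f32], labels: &[&'a str]) -> Option<(&'a str, f32)> {
    if logits.is_empty() || logits.len() != labels.len() {
        return None;
    }
    let probs = softmax(logits);
    let (idx, &p) = probs
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))?;
    Some((labels[idx], p))
}
```

With the label order from the registry above (`["negative", "positive"]`), a strongly positive logit pair yields the `positive` label with a probability close to 1.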
Define rules with triggers and actions:
# policies/default.yaml
name: safety-policy
rules:
  - name: block-injection
    trigger:
      type: pattern
      pattern: "ignore previous instructions"
      case_insensitive: true
    actions:
      - type: stop
        message: "Request blocked"
        status_code: 403
  - name: toxicity-check
    trigger:
      type: classifier
      classifier: toxicity
      threshold: 0.8
    actions:
      - type: audit
        category: safety
        severity: high

| Phase | Purpose | Latency |
|---|---|---|
| Ingress | Validate prompts before LLM | ~3ms |
| Midstream | Check streaming chunks | ~2ms/chunk |
| Egress | Final compliance check | async |
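
To make the trigger/action model in `policies/default.yaml` concrete, here is a minimal sketch of how the two example rules could be evaluated. It is not the policy engine's real API: the `Trigger`, `Action`, and `Rule` types and the `fires`, `evaluate`, and `default_rules` functions are hypothetical. It also assumes that `pattern` is a plain substring (not a regex) and that a classifier rule fires when the score is at or above `threshold`.

```rust
use std::collections::HashMap;

/// Trigger kinds mirroring the `trigger.type` values in the example policy.
#[derive(Debug, Clone, PartialEq)]
pub enum Trigger {
    Pattern { pattern: String, case_insensitive: bool },
    Classifier { classifier: String, threshold: f32 },
}

/// Action kinds mirroring the `actions[].type` values in the example policy.
#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Stop { message: String, status_code: u16 },
    Audit { category: String, severity: String },
}

#[derive(Debug, Clone)]
pub struct Rule {
    pub name: String,
    pub trigger: Trigger,
    pub actions: Vec<Action>,
}

/// Decide whether a single trigger fires for the given text and scores.
pub fn fires(trigger: &Trigger, text: &str, scores: &HashMap<String, f32>) -> bool {
    match trigger {
        // Assumption: the pattern is matched as a literal substring.
        Trigger::Pattern { pattern, case_insensitive } => {
            if *case_insensitive {
                text.to_lowercase().contains(&pattern.to_lowercase())
            } else {
                text.contains(pattern.as_str())
            }
        }
        // Assumption: fire when score >= threshold; unknown classifiers never fire.
        Trigger::Classifier { classifier, threshold } => scores
            .get(classifier)
            .map_or(false, |score| score >= threshold),
    }
}

/// Collect the actions of every rule whose trigger fires, in rule order.
pub fn evaluate<'a>(
    rules: &'a [Rule],
    text: &str,
    scores: &HashMap<String, f32>,
) -> Vec<&'a Action> {
    rules
        .iter()
        .filter(|rule| fires(&rule.trigger, text, scores))
        .flat_map(|rule| rule.actions.iter())
        .collect()
}

/// The two rules from the example policy, built by hand instead of parsed.
pub fn default_rules() -> Vec<Rule> {
    vec![
        Rule {
            name: "block-injection".to_string(),
            trigger: Trigger::Pattern {
                pattern: "ignore previous instructions".to_string(),
                case_insensitive: true,
            },
            actions: vec![Action::Stop {
                message: "Request blocked".to_string(),
                status_code: 403,
            }],
        },
        Rule {
            name: "toxicity-check".to_string(),
            trigger: Trigger::Classifier {
                classifier: "toxicity".to_string(),
                threshold: 0.8,
            },
            actions: vec![Action::Audit {
                category: "safety".to_string(),
                severity: "high".to_string(),
            }],
        },
    ]
}
```

Calling `evaluate(&default_rules(), text, &scores)` with a prompt containing "IGNORE previous instructions" returns the `Stop` action, while a toxicity score of 0.92 on harmless text returns only the `Audit` action.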
GET /health         # Basic health check
GET /health/live    # Kubernetes liveness probe
GET /health/ready   # Kubernetes readiness probe
GET /metrics        # Prometheus metrics (requires admin key)
GET /audit          # Query audit trail (requires admin key)

CheckStream is built with security as a core principle:
| Feature | Description |
|---|---|
| SSRF Protection | Backend URLs validated; internal IPs and cloud metadata endpoints blocked |
| Timing-Safe Auth | Constant-time comparison prevents API key extraction via timing attacks |
| Request Limits | 10MB body size limit prevents memory exhaustion |
| Security Headers | X-Content-Type-Options, X-Frame-Options, CSP on all responses |
| Secure IDs | Cryptographic UUID v4 for request/event IDs (unpredictable) |
| Config Limits | 1MB YAML file limit helps mitigate billion-laughs-style expansion attacks |
| Memory Safety | Written in Rust; safe Rust rules out buffer overflows and use-after-free |
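
As an illustration of the SSRF row in the table above, the following sketch classifies an already-resolved backend IP address as blocked or allowed using only the Rust standard library. It is a simplified assumption of what such a check might look like, not CheckStream's validator: `is_blocked_backend_ip` is a hypothetical name, and a real check also has to validate every address a hostname resolves to at connection time.

```rust
use std::net::IpAddr;

/// Return true for addresses a proxy should refuse to use as a backend:
/// loopback, private ranges, link-local (which includes the
/// 169.254.169.254 cloud metadata address), and unspecified/broadcast.
pub fn is_blocked_backend_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback()
                || v4.is_private()
                || v4.is_link_local()
                || v4.is_unspecified()
                || v4.is_broadcast()
        }
        IpAddr::V6(v6) => {
            // IPv4-mapped addresses (::ffff:a.b.c.d) are checked as IPv4.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_blocked_backend_ip(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}
```

For example, `is_blocked_backend_ip("169.254.169.254".parse().unwrap())` is `true`, while a public address such as `8.8.8.8` is allowed.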
Protect admin endpoints with an API key:
export CHECKSTREAM_ADMIN_API_KEY="$(openssl rand -hex 32)"
# Access protected endpoints
curl -H "X-Checkstream-Admin-Key: $CHECKSTREAM_ADMIN_API_KEY" \
  http://localhost:8080/metrics

For local development with localhost backends:
export CHECKSTREAM_DEV_MODE=1 # Never use in production!

See Security & Privacy for complete security documentation.
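
The timing-safe admin key check can be illustrated with a short constant-time comparison. This is a teaching sketch with a hypothetical function name, not the code CheckStream ships. Production code would normally rely on a vetted crate such as `subtle`, because a hand-written loop gives no guarantee against compiler optimizations. Note that this version returns early on a length mismatch, so it hides the key's contents but not its length.

```rust
/// Compare two byte strings without stopping at the first mismatch,
/// so the running time does not reveal how many leading bytes matched.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Length is not treated as secret in this sketch.
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; the result is zero only
    // if all pairs are equal.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

A handler would call something like `constant_time_eq(provided.as_bytes(), expected.as_bytes())` on the `X-Checkstream-Admin-Key` header value instead of using `==`.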
checkstream/
├── crates/
│   ├── checkstream-core/          # Types, errors, token buffer
│   ├── checkstream-classifiers/   # ML models, patterns, pipeline
│   ├── checkstream-policy/        # Policy engine, triggers, actions
│   ├── checkstream-proxy/         # HTTP proxy server
│   └── checkstream-telemetry/     # Audit trail, metrics
├── examples/
│   ├── test_hf_model.rs           # Live ML model demo
│   └── full_dynamic_pipeline.rs   # Complete pipeline example
├── policies/                      # Policy YAML files
├── models/                        # Model registry configs
└── docs/                          # Documentation
| Document | Description |
|---|---|
| Architecture | Technical design |
| Getting Started | Setup guide |
| Model Loading | ML model configuration |
| Pipeline Configuration | Classifier pipelines |
| Policy Engine | Policy-as-code reference |
| API Reference | REST API docs |
| FCA Example | Financial compliance example |
| Deployment Modes | Proxy vs Sidecar |
| Security & Privacy | Data handling |
| Regulatory Compliance | FCA, FINRA, GDPR |
- Financial Services: FCA Consumer Duty compliance, advice boundary detection
- Healthcare: HIPAA compliance, medical disclaimer injection
- Security: Prompt injection defense, PII protection, data exfiltration prevention
- Content Moderation: Real-time toxicity filtering
| Component | Target | Actual |
|---|---|---|
| Pattern classifier | <2ms | ~0.5ms |
| ML classifier (CPU) | <50ms | ~30-50ms |
| ML classifier (GPU) | <10ms | ~2-10ms (est.) |
| Policy evaluation | <1ms | ~0.2ms |
| Total overhead | <10ms | ~5-8ms (patterns only) |
See CONTRIBUTING.md for guidelines.
Apache 2.0 - See LICENSE
- Documentation: docs/
- Issues: GitHub Issues
Built for trust at the speed of generation.