a2a-sentinel is built on a principle: Security ON by Default.
Every security feature is enabled out of the box. Disabling any protection requires explicit configuration. This guide explains the threat model, authentication modes, rate limiting strategy, and how to configure sentinel for your security requirements.
- Security Philosophy
- Threat Model & Defenses
- Authentication Modes
- Rate Limiting (2-Layer)
- Policy Engine (ABAC)
- Agent Card Security
- Audit Logging
- Push Notification Protection
- Replay Attack Prevention
- Trusted Proxies
- Error Messages & Hints
- Configuration Reference
- Reporting Vulnerabilities
a2a-sentinel follows a defense-in-depth approach with sensible defaults:
- Explicit over implicit: All security decisions are explicit, and every decision (allow/block) is logged.
- Educational errors: Every security block includes a `hint` explaining what went wrong and a `docs_url` pointing to the fix.
- Observable: All decisions are logged in an OTel-compatible structured format for audit trails.
- Gateway responsibility: sentinel protects your agents. Agents don't need to validate sentinel-specific requirements.
| Level | Use Case | Config Profile |
|---|---|---|
| Development | Local testing, no auth needed | sentinel init --profile dev |
| Strict Development | Team testing, auth headers required but not validated | sentinel init --profile strict-dev |
| Production | Full JWT validation, aggressive rate limiting | sentinel init --profile prod |
This table maps real-world threats against a2a-sentinel defenses:
| # | Threat | Attack Vector | Sentinel Defense | Configuration |
|---|---|---|---|---|
| 1 | Unauthorized access | Missing or forged authentication tokens | 2-layer authentication (passthrough-strict default) | security.auth.mode |
| 2 | DoS/DDoS | Request flooding from single IP | Per-IP rate limiting (pre-auth) | security.rate_limit.ip.per_ip, listen.global_rate_limit |
| 3 | User abuse | Single authenticated user hammering the gateway | Per-user rate limiting (post-auth) | security.rate_limit.user.per_user |
| 4 | Agent Card poisoning | Attacker modifies agent card in transit | Change detection + alert logging | agents[].card_change_policy |
| 5 | Cache poisoning | Attacker injects malicious card during polling | JWS signature verification | security.card_signature.require |
| 6 | SSRF via push notifications | Attacker tricks gateway into accessing private network | URL validation, private IP blocking, HTTPS enforcement | security.push.block_private_networks |
| 7 | Replay attacks | Attacker replays old requests to trigger actions | Nonce + timestamp validation (warn/require policies) | security.replay.enabled |
| 8 | Man-in-the-middle | Unencrypted communication with agents | TLS enforcement by default | agents[].allow_insecure: false |
| 9 | Resource exhaustion | Too many concurrent SSE streams per agent | Per-agent stream limit | agents[].max_streams |
| 10 | Connection exhaustion | Too many total gateway connections | Global connection limit | listen.max_connections |
| 11 | Unauthorized agent access | User accesses restricted agents or methods | ABAC policy engine with attribute-based rules | security.policies[] |
| 12 | Off-hours exploitation | Attacks during unmonitored periods | Time-based policy restrictions | security.policies[].conditions.time |
a2a-sentinel supports four authentication modes, controlled by security.auth.mode. Choose one:
Behavior: Accept requests with or without Authorization headers. No validation.
Use case: Local development before agents are ready.
Config:
security:
  auth:
    mode: passthrough
Risk: Offers zero protection. Only safe on localhost.
Behavior: Require Authorization header, but don't validate the token. Extract and log the subject claim (with "unverified:" prefix).
Use case: Team development, docker-compose testing, strict header enforcement without JWT overhead.
Config:
security:
  auth:
    mode: passthrough-strict
    allow_unauthenticated: false  # Require header
How it works:
- Request arrives without Authorization header → rejected with 401
- Request with Authorization header → accepted, subject extracted from token (if JWT) or truncated (if opaque)
- Subject logged as `unverified:<subject>` (marks unvalidated origin)
Example audit log:
{
"timestamp": "2025-02-26T12:34:56Z",
"a2a.auth.subject": "unverified:user-123"
}
Behavior: Full JWT validation: issuer, audience, expiry, and JWKS signature verification.
Use case: Production with OAuth2/OIDC token providers.
Config:
security:
  auth:
    mode: jwt
    allow_unauthenticated: false
    schemes:
      - type: bearer
        jwt:
          issuer: https://auth.example.com
          audience: sentinel-api
          jwks_url: https://auth.example.com/.well-known/jwks.json
Validation:
- Token format: `Authorization: Bearer <JWT>`
- Signature verified against JWKS endpoint
- Claims validated: `iss`, `aud`, `exp`
- Subject (`sub` claim) extracted and logged as verified
Example JWT token:
eyJhbGciOiJSUzI1NiIsImtpZCI6ImtleTEifQ.
eyJzdWIiOiJ1c2VyLTEyMyIsImlzcyI6Imh0dHBzOi8vYXV0aC5leGFtcGxlLmNvbSIsImF1ZCI6InNlbnRpbmVsLWFwaSIsImV4cCI6MTcwODk5OTAwMH0.
<signature>
Error cases:
- Missing Authorization header → 401 (if not allowed)
- Invalid signature → 401
- Expired token → 401
- Wrong issuer → 401
- Wrong audience → 401
Behavior: Simple shared secret in Authorization header.
Use case: Simple deployments, internal APIs with limited clients.
Config:
security:
  auth:
    mode: api-key
    allow_unauthenticated: false
    schemes:
      - type: bearer
        api_key:
          secret: sk_abc123xyz  # Keep in environment variable!
How it works:
- Client sends: `Authorization: Bearer sk_abc123xyz`
- Gateway compares against configured secret
- If match → allow, log subject as "api-key-user"
- If mismatch → 401
Best practice: Store secret in environment variable:
export SENTINEL_API_KEY="sk_$(openssl rand -hex 16)"
./sentinel serve --config sentinel.yaml
a2a-sentinel enforces rate limits at two points in the pipeline: before authentication (global and per-IP) and after authentication (per-user):
Request arrives
↓
Layer 1: Global rate limit (saves CPU on invalid traffic)
↓
Layer 2: Per-IP rate limit (defense against single-IP floods)
↓
Authentication & routing
↓
Layer 3: Per-user rate limit (defense against authenticated abuse)
↓
Request forwarded to agent
What: Gateway-wide token bucket. All traffic shares one limit.
Default: 5,000 requests/minute (83 req/sec)
Use case: Prevent gateway overload. First line of defense against any DDoS.
Config:
listen:
  global_rate_limit: 5000  # req/min
Behavior:
- Request arrives → check global token bucket
- Token available → consume a token, allow request
- No token → reject with 503 (ErrGlobalLimitReached)
Error response:
{
"error": {
"code": 503,
"message": "Gateway capacity reached",
"hint": "Gateway is at maximum connections. Try again shortly",
"docs_url": "https://a2a-sentinel.dev/docs/limits"
}
}
What: Separate token bucket per client IP. Defense against single-IP attacks.
Default: 200 requests/minute per IP, burst of 50
Use case: Fair use across many clients. Prevent one bad actor from hogging the gateway.
Config:
security:
  rate_limit:
    enabled: true
    ip:
      per_ip: 200           # req/min per IP
      burst: 50             # allow burst up to 50
      cleanup_interval: 5m  # remove inactive IPs after 5min
How IP extraction works:
a2a-sentinel respects the X-Forwarded-For header when behind a trusted proxy:
listen:
  trusted_proxies:
    - "10.0.0.0/8"   # Trust nginx/reverse proxy on private network
    - "203.0.113.5"  # Trust specific proxy IP
Algorithm (TrustedClientIP):
- If trusted_proxies is empty → use RemoteAddr only (safest default)
- If trusted_proxies set → parse X-Forwarded-For from right to left
- Return rightmost IP that is NOT in trusted_proxies (the actual client)
Example:
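- RemoteAddr: `10.0.0.1` (reverse proxy)
- X-Forwarded-For: `203.0.113.99, 10.0.0.1` (attacker, proxy)
- trusted_proxies: `["10.0.0.0/8"]`
- Extracted IP: `203.0.113.99` (the actual attacker)
If trusted_proxies is broader than your real proxy fleet, attackers can spoof X-Forwarded-For to evade per-IP limits. If it is left empty while sentinel sits behind a proxy, every client is counted against the proxy's address.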
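The right-to-left walk described above can be sketched in a few lines. This is an illustrative Python sketch, not sentinel's actual `TrustedClientIP` code; the function name, its arguments, and the extra check that the direct peer is itself a trusted proxy are assumptions made for the example.

```python
import ipaddress


def trusted_client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Pick the address to rate-limit on (hypothetical helper)."""
    networks = [ipaddress.ip_network(p, strict=False) for p in trusted_proxies]

    def is_trusted(addr):
        return any(addr in net for net in networks)

    # Empty trusted_proxies: ignore X-Forwarded-For entirely (safest default).
    # Assumption: the header is also ignored when the direct peer is not a
    # trusted proxy, since anyone could have written it.
    if not networks or not is_trusted(ipaddress.ip_address(remote_addr)):
        return remote_addr

    hops = [h.strip() for h in (x_forwarded_for or "").split(",") if h.strip()]
    for hop in reversed(hops):  # walk from right to left
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            return remote_addr  # malformed entry: fall back to the peer
        if not is_trusted(addr):
            return hop  # rightmost address that is not one of our proxies
    return remote_addr  # every hop was a trusted proxy
```

Because only the rightmost untrusted entry is used, anything a client prepends to the header on its own is ignored.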
What: Separate token bucket per authenticated user (subject). Defense against authenticated abuse.
Default: 100 requests/minute per user, burst of 20
Applies to: Only authenticated requests (passthrough-strict with subject, jwt, api-key)
Config:
security:
  rate_limit:
    enabled: true
    user:
      per_user: 100         # req/min per user
      burst: 20             # allow burst up to 20
      cleanup_interval: 5m  # remove inactive users after 5min
User identification:
- JWT mode: Uses `sub` (subject) claim
- passthrough-strict: Uses extracted subject (prefixed with "unverified:")
- api-key mode: Uses "api-key-user"
- Unauthenticated: Skips per-user limit (falls back to per-IP only)
Cleanup mechanism:
To prevent unbounded memory growth, inactive user entries are removed after cleanup_interval. The last-activity timestamp is updated on every request.
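A per-key token bucket with idle cleanup might look like the following. This is a minimal Python sketch under stated assumptions: the class names, the `idle_ttl` parameter, and the injectable `now` argument are hypothetical and chosen for illustration, not taken from sentinel's implementation.

```python
import time


class TokenBucket:
    """Refills at rate_per_min / 60 tokens per second, capped at burst."""

    def __init__(self, rate_per_min, burst, now):
        self.rate = rate_per_min / 60.0
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = now  # last-activity timestamp, updated on every request

    def allow(self, now):
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class KeyedLimiter:
    """One bucket per key (client IP or authenticated subject)."""

    def __init__(self, rate_per_min, burst, idle_ttl=300.0):
        self.rate_per_min = rate_per_min
        self.burst = burst
        self.idle_ttl = idle_ttl  # seconds of inactivity before removal
        self.buckets = {}

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(self.rate_per_min, self.burst, now)
        return bucket.allow(now)

    def cleanup(self, now=None):
        """Drop idle buckets; returns how many were removed."""
        now = time.monotonic() if now is None else now
        stale = [k for k, b in self.buckets.items() if now - b.last >= self.idle_ttl]
        for key in stale:
            del self.buckets[key]
        return len(stale)
```

With `KeyedLimiter(100, 20)`, a user can send 20 requests back to back and is then held to the refill rate of 100 per minute.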
Both IP and user limits return 429 (Too Many Requests):
{
"error": {
"code": 429,
"message": "Rate limit exceeded",
"hint": "Wait before retrying. Configure security.rate_limit in sentinel.yaml",
"docs_url": "https://a2a-sentinel.dev/docs/rate-limit"
}
}
Audit log entry (with sampling):
{
"timestamp": "2025-02-26T12:34:56Z",
"a2a.status": "block",
"a2a.block_reason": "rate_limit_exceeded",
"rate_limit_state": {
"user_remaining": 0,
"user_reset_secs": 30
}
}
a2a-sentinel includes an attribute-based access control (ABAC) policy engine that evaluates rules after authentication. Policies provide fine-grained control over who can access which agents, methods, and resources, and when.
The PolicyGuard middleware sits in the security pipeline after authentication and user rate limiting. For each request, it:
- Collects request attributes (source IP, authenticated user, target agent, A2A method, current time, HTTP headers)
- Evaluates all matching policy rules in priority order (lowest number = highest priority)
- First matching rule determines the outcome (allow or deny)
- If no rule matches, the request is allowed (default-allow)
Each policy rule has:
- name: Human-readable identifier
- priority: Evaluation order (lower = evaluated first)
- effect: `allow` or `deny`
- conditions: Attribute matchers (all conditions in a rule must match for the rule to apply)
Block requests from specific IP ranges. Supports CIDR notation and negation.
security:
  policies:
    # Block all traffic from a known-bad network
    - name: block-bad-network
      priority: 10
      effect: deny
      conditions:
        source_ip:
          cidr: ["203.0.113.0/24", "198.51.100.0/24"]
    # Allow only corporate network, deny everything else
    - name: allow-corporate-only
      priority: 20
      effect: deny
      conditions:
        source_ip:
          not_cidr: ["10.0.0.0/8", "172.16.0.0/12"]
CIDR negation: Use `not_cidr` to match requests that are NOT from the specified ranges. This is useful for "allow only these networks" patterns.
Restrict access to specific time windows. Useful for business-hours-only policies or maintenance windows.
security:
  policies:
    # Deny access outside business hours (Eastern Time)
    - name: business-hours-only
      priority: 20
      effect: deny
      conditions:
        time:
          outside: "09:00-17:00"
          timezone: "America/New_York"
    # Deny access during maintenance window (UTC)
    - name: maintenance-window
      priority: 5
      effect: deny
      conditions:
        time:
          within: "02:00-04:00"
          timezone: "UTC"
          days: ["Saturday"]
Time conditions:
- `within`: Match requests during this time range
- `outside`: Match requests outside this time range
- `timezone`: IANA timezone (default "UTC")
- `days`: Optional day-of-week filter (Monday, Tuesday, etc.)
Restrict which users or IPs can access specific agents.
security:
  policies:
    # Only admins can access the internal-agent
    - name: restrict-internal-agent
      priority: 30
      effect: deny
      conditions:
        agent: ["internal-agent"]
        user_not: ["admin@example.com", "ops@example.com"]
    # Block external IPs from accessing sensitive agent
    - name: sensitive-agent-internal-only
      priority: 25
      effect: deny
      conditions:
        agent: ["sensitive-agent"]
        source_ip:
          not_cidr: ["10.0.0.0/8"]
Control access based on authenticated user identity.
security:
  policies:
    # Block a specific user
    - name: block-suspended-user
      priority: 10
      effect: deny
      conditions:
        user: ["suspended-user@example.com"]
    # Allow only specific users to use expensive methods
    - name: restrict-expensive-methods
      priority: 30
      effect: deny
      conditions:
        method: ["tasks/pushNotification/set"]
        user_not: ["premium-user@example.com", "admin@example.com"]
Restrict specific A2A methods.
security:
  policies:
    # Disable push notifications entirely
    - name: disable-push
      priority: 15
      effect: deny
      conditions:
        method: ["tasks/pushNotification/set", "tasks/pushNotification/get"]
    # Read-only mode: only allow message/send, block task management
    - name: read-only-mode
      priority: 20
      effect: deny
      conditions:
        method: ["tasks/cancel", "tasks/delete"]
Match requests based on HTTP header values.
security:
  policies:
    # Block requests without a specific custom header
    - name: require-team-header
      priority: 25
      effect: deny
      conditions:
        header_missing: ["X-Team-ID"]
    # Block requests from a specific client version
    - name: block-old-client
      priority: 20
      effect: deny
      conditions:
        header:
          User-Agent: ["OldClient/1.0*"]
Rules are evaluated in priority order (lowest number first). The first matching rule determines the outcome:
Request arrives after authentication
↓
Sort policies by priority (ascending)
↓
For each policy:
↓
Check all conditions against request attributes
↓
All conditions match?
YES → Apply effect (allow/deny), STOP evaluation
NO → Continue to next policy
↓
No policy matched → DEFAULT ALLOW
Example evaluation:
policies:
  - name: allow-admin       # priority: 10
    priority: 10
    effect: allow
    conditions:
      user: ["admin@example.com"]
  - name: block-bad-ip      # priority: 20
    priority: 20
    effect: deny
    conditions:
      source_ip:
        cidr: ["203.0.113.0/24"]
  - name: business-hours    # priority: 30
    priority: 30
    effect: deny
    conditions:
      time:
        outside: "09:00-17:00"
For a request from admin@example.com at 2 AM from IP 203.0.113.50:
- Check `allow-admin` (priority 10): user matches → ALLOW (stops here)
For a request from user@example.com at 2 AM from IP 203.0.113.50:
- Check `allow-admin` (priority 10): user does not match → skip
- Check `block-bad-ip` (priority 20): IP matches CIDR → DENY (stops here)
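The first-match evaluation can be illustrated with a short Python sketch. It models only the `user` and `source_ip.cidr` conditions; the `Policy` class and `evaluate` function are hypothetical names for this example and do not reflect the PolicyGuard source.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    priority: int
    effect: str  # "allow" or "deny"
    users: list = field(default_factory=list)  # empty list = condition not set
    cidrs: list = field(default_factory=list)

    def matches(self, user, source_ip):
        # Every condition that is set must match for the rule to apply.
        if self.users and user not in self.users:
            return False
        if self.cidrs:
            addr = ipaddress.ip_address(source_ip)
            if not any(addr in ipaddress.ip_network(c) for c in self.cidrs):
                return False
        return True


def evaluate(policies, user, source_ip):
    """Return (effect, policy_name); the first match in priority order wins."""
    for policy in sorted(policies, key=lambda p: p.priority):
        if policy.matches(user, source_ip):
            return policy.effect, policy.name
    return "allow", None  # no rule matched: default-allow


policies = [
    Policy("block-bad-ip", 20, "deny", cidrs=["203.0.113.0/24"]),
    Policy("allow-admin", 10, "allow", users=["admin@example.com"]),
]
print(evaluate(policies, "admin@example.com", "203.0.113.50"))
print(evaluate(policies, "user@example.com", "203.0.113.50"))
```

Because evaluation stops at the first match, an `allow` rule with a low priority number acts as an exemption from every rule behind it.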
When a request is denied by a policy rule:
{
"error": {
"code": 403,
"message": "Request denied by policy",
"hint": "Policy 'business-hours-only' denied this request. Contact admin for access",
"docs_url": "https://a2a-sentinel.dev/docs/policies"
}
}
The hint includes the policy name to help administrators identify which rule triggered the block.
Policy rules are hot-reloadable. When the configuration is reloaded (via SIGHUP or file watch), policy rules are atomically swapped without dropping any in-flight requests.
# Edit sentinel.yaml to update policies, then:
kill -HUP $(pidof sentinel)
# Or use MCP tool:
MCP tool: reload_config
Changes take effect immediately. No restart required.
Two MCP tools are available for policy management:
list_policies — List all configured policies with their priority, effect, and conditions:
MCP tool: list_policies
evaluate_policy — Test policies against a simulated request context:
MCP tool: evaluate_policy {
"source_ip": "203.0.113.50",
"user": "test@example.com",
"agent": "echo",
"method": "message/send"
}
Returns which policy would match and whether the request would be allowed or denied.
Agent Cards describe the agent's capabilities, security schemes, and methods. a2a-sentinel periodically fetches and caches them, with multiple safeguards:
What: Gateway fetches /.well-known/agent.json from each agent at regular intervals.
Default interval: 60 seconds per agent
Config:
agents:
  - name: my-agent
    url: https://agent.example.com
    card_path: /.well-known/agent.json
    poll_interval: 60s     # Fetch every 60 seconds
    timeout: 30s           # 30s timeout on fetch
    allow_insecure: false  # Require HTTPS (default)
Security measures:
- Body size limit: 1 MB (prevents DoS via huge card)
- Timeout: Configurable (default 30s, prevents hanging)
- TLS enforcement: HTTPS required by default (set `allow_insecure: true` only for dev)
Error handling:
- Network error → log warning, mark agent unhealthy, keep cached card
- Invalid JSON → log warning, mark unhealthy, keep cached card
- HTTP error (non-200) → log warning, mark unhealthy
What: When a new card is fetched, sentinel compares it against the cached version and detects changes.
Critical changes (marked for alert):
- URL changed
- Version changed
- Security schemes added/removed
- Skills count changed >50%
Non-critical changes:
- Name, description changed
- Capabilities changed (streaming, push, history)
Use case: Detect cache poisoning or unauthorized card updates.
Example:
Old card: version "1.0", 5 skills
New card: version "1.1", 10 skills
Detected: critical=true (version changed, skills count changed by 100%)
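A change classifier that follows the critical and non-critical lists above could be sketched as below. This is illustrative Python only; the card field names (`url`, `version`, `securitySchemes`, `skills`, `capabilities`) and the function name are assumptions for the example, and the 50% threshold is measured relative to the old skills count.

```python
def classify_card_change(old, new):
    """Return (critical, non_critical) lists naming the fields that changed."""
    critical, non_critical = [], []

    for key in ("url", "version"):
        if old.get(key) != new.get(key):
            critical.append(key)

    # Security schemes added or removed (compared by scheme name).
    old_schemes = set(old.get("securitySchemes") or {})
    new_schemes = set(new.get("securitySchemes") or {})
    if old_schemes != new_schemes:
        critical.append("securitySchemes")

    # Skills count changed by more than 50% of the old count.
    old_n = len(old.get("skills") or [])
    new_n = len(new.get("skills") or [])
    if old_n != new_n and (old_n == 0 or abs(new_n - old_n) / old_n > 0.5):
        critical.append("skills")

    for key in ("name", "description", "capabilities"):
        if old.get(key) != new.get(key):
            non_critical.append(key)

    return critical, non_critical
```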
Default: alert (prevent changes from taking effect)
Behavior: Keep old card, log warning. Changes are ignored.
Audit log:
{
"timestamp": "2025-02-26T12:34:56Z",
"level": "warn",
"msg": "agent_card_change_detected",
"agent": "my-agent",
"policy": "alert",
"changes": 2,
"critical": true
}
Use case: Production. Require manual review before agent updates.
Config:
agents:
  - name: my-agent
    url: https://agent.example.com
    card_change_policy: alert
Behavior: Apply changes immediately, log info entry.
Audit log:
{
"timestamp": "2025-02-26T12:34:56Z",
"level": "info",
"msg": "agent_card_updated",
"agent": "my-agent",
"policy": "auto",
"changes": 2
}
Use case: Development. Rolling updates without manual intervention.
Config:
agents:
  - name: my-agent
    card_change_policy: auto
Behavior: Store changes in pending queue. Manual approval via MCP tools. Keeps old card until approved.
When a card change is detected and the policy is approve:
- New card is stored in the pending changes queue
- Old card remains active
- Audit log records pending change
- Operator reviews via MCP tools: `list_pending_changes`, `approve_card_change`, `reject_card_change`
- On approval, new card replaces the old one
- On rejection, pending change is discarded
Config:
agents:
  - name: my-agent
    card_change_policy: approve
MCP approval workflow:
# List pending changes
MCP tool: list_pending_changes
# Approve a specific change
MCP tool: approve_card_change { "agent": "my-agent" }
# Reject a specific change
MCP tool: reject_card_change { "agent": "my-agent" }
What: Validate Agent Card signatures using the agent's JWK (JSON Web Key). When an agent serves its Agent Card as a JWS (JSON Web Signature) compact serialization, sentinel verifies the signature during polling to ensure the card has not been tampered with in transit.
Default: Not required. Optional but recommended for production deployments.
Config:
security:
  card_signature:
    require: true
    trusted_jwks_urls:
      - https://agent.example.com/.well-known/jwks.json
    cache_ttl: 1h
How it works:
- Agent Card Manager fetches the card from the backend agent
- If the response body is a JWS compact serialization (three base64url-encoded segments separated by dots), sentinel treats it as a signed card
- Sentinel fetches the agent's JWKS from the configured `trusted_jwks_urls`
- The JWS signature is verified against the JWKS keyset
- The JWS payload is extracted and used as the Agent Card JSON
- JWKS keys are cached for the configured `cache_ttl` (default 1 hour) to avoid repeated fetches
- If signature verification fails:
  - `require: true`: mark card unhealthy, keep previously cached card, log error
  - `require: false`: log warning, accept unsigned cards but verify signed ones
Trusted JWKS URLs: You can configure multiple JWKS endpoints. Sentinel will try each in order and accept the first successful verification. This supports key rotation scenarios where agents may publish new keys before retiring old ones.
Error on verification failure:
{
"error": {
"code": 401,
"message": "Agent Card signature verification failed",
"hint": "Ensure the agent's JWKS endpoint is reachable and keys are valid",
"docs_url": "https://a2a-sentinel.dev/docs/card-signature"
}
}
All requests are logged in OpenTelemetry-compatible structured JSON format, so you can track security decisions, debug issues, and audit for compliance.
{
"timestamp": "2025-02-26T12:34:56Z",
"level": "info",
"msg": "audit",
"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
"span_id": "00f067aa0ba902b7",
"attributes": {
"a2a.method": "POST",
"a2a.protocol": "json-rpc",
"a2a.target_agent": "my-agent",
"a2a.auth.scheme": "bearer",
"a2a.auth.subject": "user-123",
"a2a.status": "allow",
"a2a.block_reason": "",
"a2a.start_time": "2025-02-26T12:34:56Z"
},
"stream": {
"events": 42,
"duration_ms": 5000
}
}
| Field | Meaning |
|---|---|
| `trace_id` | OpenTelemetry trace ID (for correlation) |
| `span_id` | OpenTelemetry span ID |
| `a2a.method` | HTTP method (POST, GET, etc.) |
| `a2a.protocol` | Protocol detected (json-rpc, rest, agent-card) |
| `a2a.target_agent` | Agent name matched by router |
| `a2a.auth.scheme` | Auth scheme (bearer, api-key, none) |
| `a2a.auth.subject` | Authenticated user ID or "unverified:..." |
| `a2a.status` | Decision: allow, block |
| `a2a.block_reason` | If blocked: rate_limit_exceeded, auth_required, forbidden, etc. |
| `a2a.start_time` | Request start timestamp |
| `stream.events` | For SSE: number of events sent (if streaming) |
| `stream.duration_ms` | For SSE: total stream duration in ms |
| Status | Reason | Example |
|---|---|---|
| `allow` | Request passed all checks | Authenticated, not rate-limited |
| `block` | Request rejected by security layer | Rate limit hit, auth failed |
| Block Reason | Meaning | HTTP Code |
|---|---|---|
| `auth_required` | No auth header when required | 401 |
| `auth_invalid` | Invalid token (signature, expiry, issuer) | 401 |
| `rate_limit_exceeded` | IP or user rate limit hit | 429 |
| `global_limit_reached` | Gateway at max capacity | 503 |
| `forbidden` | Authenticated but lacks permission | 403 |
| `ssrf_blocked` | Push notification URL blocked | 403 |
| `replay_detected` | Nonce/timestamp validation failed | 409 |
| `policy_violation` | ABAC policy rule denied the request | 403 |
By default, ALL log entries are recorded. Configure sampling to reduce noise in high-volume environments:
logging:
  audit:
    sampling_rate: 0.1        # Log 10% of allowed requests (for volume reduction)
    error_sampling_rate: 1.0  # Always log errors/blocks (100%)
    max_body_log_size: 1024   # Truncate request bodies to 1KB in logs
Why separate error sampling? Blocks are security events and should always be logged. Normal traffic can be sampled to reduce log volume.
Example:
- 10,000 allowed requests → only 1,000 logged (10% sampling)
- 10 blocked requests → all 10 logged (100% error sampling)
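The two-rate sampling decision is small enough to sketch directly. This Python snippet is an illustration under the assumption that sampling is a per-request random draw; the function name and the injectable `rng` parameter are hypothetical.

```python
import random


def should_log(status, sampling_rate=0.1, error_sampling_rate=1.0, rng=random.random):
    """Decide whether to write an audit entry for one request."""
    # Blocks use the error rate; allowed traffic uses the normal rate.
    rate = error_sampling_rate if status == "block" else sampling_rate
    # rng() is in [0.0, 1.0), so a rate of 1.0 always logs and 0.0 never does.
    return rng() < rate
```

With a random draw, 10,000 allowed requests produce roughly, not exactly, 1,000 entries.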
Find all blocked requests:
jq 'select(.attributes["a2a.status"] == "block")' sentinel.log
Find rate limit violations:
jq 'select(.attributes["a2a.block_reason"] == "rate_limit_exceeded")' sentinel.log
Find requests by user:
jq 'select(.attributes["a2a.auth.subject"] == "user-123")' sentinel.log
Find long-running SSE streams:
jq 'select(.stream.duration_ms > 30000)' sentinel.log
Push notifications allow agents to send updates to clients. However, they create an SSRF (Server-Side Request Forgery) vector if not validated. An attacker could trick the gateway into making requests to internal services by providing a push notification URL that resolves to a private network address.
What: Block push notification URLs that resolve to private networks. Sentinel validates all push notification URLs before making outbound requests.
Default: Enabled (block_private_networks: true)
Config:
security:
  push:
    block_private_networks: true  # Block 10.x, 172.16-31.x, 192.168.x, 127.x, ::1
    allowed_domains: []           # Optional: allowlist specific domains
    require_https: true           # Require HTTPS for push URLs
    hmac_secret: ""               # Sign webhooks with HMAC-SHA256
How it works:
- Client or agent provides a push notification URL
- Sentinel parses the URL and extracts the hostname
- The hostname is resolved to an IP address via DNS
- The resolved IP is checked against blocked private network ranges
- If the URL's hostname matches an entry in `allowed_domains`, it is permitted regardless of IP range
- If HTTPS is required (`require_https: true`), non-HTTPS URLs are rejected
- If all checks pass, the push notification request proceeds
Blocked IP ranges:
- `10.0.0.0/8` (Private, RFC 1918)
- `172.16.0.0/12` (Private, RFC 1918)
- `192.168.0.0/16` (Private, RFC 1918)
- `127.0.0.0/8` (Loopback, IPv4)
- `::1/128` (Loopback, IPv6)
- `169.254.0.0/16` (Link-local, IPv4)
- `fe80::/10` (Link-local, IPv6)
- `fc00::/7` (Unique local, IPv6)
Error response:
{
"error": {
"code": 403,
"message": "Push notification URL blocked",
"hint": "URL resolves to private network. Use public URLs or configure security.push.allowed_domains",
"docs_url": "https://a2a-sentinel.dev/docs/ssrf"
}
}
If you have legitimate internal webhooks, allowlist them:
security:
  push:
    block_private_networks: true
    allowed_domains:
      - "internal.company.com"      # Allow even if private
      - "webhook.service.internal"
Algorithm:
- Parse push URL
- Check hostname against `allowed_domains`; if match, allow immediately
- Resolve hostname to IP address
- If IP in private range → reject with `ErrSSRFBlocked`
  - DNS lookup failure: controlled by `dns_fail_policy` (default: `block` = fail-closed)
- If `require_https: true` and scheme is not HTTPS → reject
- Otherwise → allow
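The algorithm above can be sketched in Python as follows. This is a hedged illustration, not sentinel's code: `check_push_url`, `resolve_all`, the reason strings, and the injectable `resolver` are assumptions for the example, and DNS failure is treated as fail-closed to mirror the default `dns_fail_policy`.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

BLOCKED_NETWORKS = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8",
    "::1/128", "169.254.0.0/16", "fe80::/10", "fc00::/7",
)]


def resolve_all(host):
    """Resolve a hostname to every address it maps to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def check_push_url(url, allowed_domains=(), require_https=True, resolver=resolve_all):
    """Return (allowed, reason) for a push notification URL."""
    parts = urlsplit(url)
    host = parts.hostname
    if not host:
        return False, "invalid_url"
    if host in allowed_domains:
        return True, "allowed_domain"  # allowlisted: skip the IP check
    try:
        addresses = resolver(host)
    except OSError:
        return False, "dns_failure"  # fail-closed
    if not addresses:
        return False, "dns_failure"
    for raw in addresses:
        addr = ipaddress.ip_address(raw.split("%")[0])  # drop IPv6 zone id
        if any(addr in net for net in BLOCKED_NETWORKS):
            return False, "ssrf_blocked"
    if require_https and parts.scheme != "https":
        return False, "https_required"
    return True, "ok"
```

Checking every resolved address, rather than only the first, avoids a hostname that maps to both a public and a private IP slipping through. The outbound request should then connect to the address that was checked, since a second DNS lookup may return a different answer.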
Validate webhook authenticity with HMAC-SHA256 signatures:
security:
  push:
    require_https: true
    hmac_secret: "sk_webhook_secret_key"
When `hmac_secret` is configured, sentinel signs outbound push notification requests with an `X-Sentinel-Signature` header containing the HMAC-SHA256 digest of the request body. Webhook receivers verify the signature to ensure the notification originated from sentinel.
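On the receiving side, verification amounts to recomputing the digest over the raw body and comparing in constant time. The Python sketch below assumes the header carries a lowercase hex digest with no prefix; the source does not specify the encoding, so check what your sentinel version actually sends. The function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """HMAC-SHA256 of the raw request body, hex-encoded (assumed encoding)."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, header_value) -> bool:
    """Compare the X-Sentinel-Signature value against a locally computed digest."""
    expected = sign_body(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), (header_value or "").encode())
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, because even whitespace changes alter the digest.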
In a replay attack, an attacker records a valid request and resends it later to trigger unintended actions.
What: Track unique nonces and validate request timestamps. Reject or warn on requests that have been seen before or are older than the configured window.
Default: Enabled
Config:
security:
  replay:
    enabled: true
    window: 300s           # Accept requests ≤5 minutes old
    nonce_policy: warn     # warn | require
    nonce_source: auto     # auto | header | jsonrpc-id
    clock_skew: 5s         # Timestamp clock skew tolerance
    store: memory          # memory | redis
    redis_url: ""          # If store: redis
    cleanup_interval: 60s  # Cleanup expired nonces every 60s
Nonce policies:
| Policy | Behavior | Use Case |
|---|---|---|
| `warn` | Log warning if nonce already seen, but still allow the request | Early warning, gradual rollout |
| `require` | Reject the request if nonce already seen | Strict protection for production |
Nonce sources (nonce_source):
| Source | Behavior |
|---|---|
| `auto` (default) | Check X-Sentinel-Nonce header first, fall back to JSON-RPC id field |
| `header` | Only use X-Sentinel-Nonce header (ignore body) |
| `jsonrpc-id` | Only use JSON-RPC id field from request body |
Timestamp validation:
When the X-Sentinel-Timestamp header is present, sentinel validates that the request is within the replay window:
- Accepts RFC3339 format (e.g., `2026-02-27T12:00:00Z`) or Unix epoch (10-digit, e.g., `1740657600`)
- Rejects if the timestamp is older than `window` (past) or more than `clock_skew` into the future
- Without the header, `time.Now()` is used (no timestamp validation)
Flow:
- Client includes a unique nonce in the `X-Sentinel-Nonce` header and optionally a timestamp in the `X-Sentinel-Timestamp` header
- Sentinel extracts nonce based on `nonce_source` configuration
- If `X-Sentinel-Timestamp` header is present, validates timestamp freshness
- Checks the nonce against the nonce store
- If the nonce has been seen before:
  - `warn`: Log warning, forward request anyway (never blocks)
  - `require`: Reject with 409 error
- If the nonce is new: record it in the store with expiry timestamp
- A background goroutine periodically cleans up expired nonces based on `cleanup_interval`
Memory management: The in-memory nonce store uses a map with periodic cleanup. Entries older than window are purged every cleanup_interval to prevent unbounded memory growth.
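The nonce and timestamp checks can be modelled with a small in-memory store. The Python sketch below is illustrative only: `ReplayGuard`, `parse_timestamp`, and the `"allow"`/`"warn"`/`"reject"` return values are hypothetical, and it assumes a stale or future timestamp is rejected regardless of `nonce_policy`.

```python
import time
from datetime import datetime


def parse_timestamp(value):
    """Accept a Unix epoch string or an RFC3339 timestamp; return epoch seconds."""
    if value.isdigit():
        return float(value)
    return datetime.fromisoformat(value.replace("Z", "+00:00")).timestamp()


class ReplayGuard:
    def __init__(self, window=300.0, clock_skew=5.0, nonce_policy="warn"):
        self.window = window
        self.clock_skew = clock_skew
        self.nonce_policy = nonce_policy
        self.seen = {}  # nonce -> expiry (epoch seconds)

    def check(self, nonce, timestamp=None, now=None):
        now = time.time() if now is None else now
        if timestamp is not None:
            ts = parse_timestamp(timestamp)
            if ts < now - self.window or ts > now + self.clock_skew:
                return "reject"  # too old, or too far in the future
        expiry = self.seen.get(nonce)
        if expiry is not None and expiry > now:
            return "reject" if self.nonce_policy == "require" else "warn"
        self.seen[nonce] = now + self.window  # first sighting: remember it
        return "allow"

    def cleanup(self, now=None):
        """Purge expired nonces; returns how many were removed."""
        now = time.time() if now is None else now
        expired = [n for n, exp in self.seen.items() if exp <= now]
        for nonce in expired:
            del self.seen[nonce]
        return len(expired)
```

Keeping each nonce for exactly `window` seconds is sufficient when timestamps are enforced, because any replay arriving later fails the timestamp check instead.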
Error response:
{
"error": {
"code": 409,
"message": "Replay attack detected",
"hint": "Include unique nonce and current timestamp in request",
"docs_url": "https://a2a-sentinel.dev/docs/replay"
}
}
Client adds to request headers:
X-Sentinel-Nonce: abc123def456xyz789 # Unique nonce (UUID recommended)
X-Sentinel-Timestamp: 2026-02-27T12:00:00Z # Optional: request timestamp (RFC3339)
Gateway validation:
- Extract nonce (header first, then JSON-RPC id, depending on `nonce_source`)
- If X-Sentinel-Timestamp present → check it is within the window (reject if too old or too far in the future)
- If valid and new → record nonce, forward request
- If duplicate nonce → warn or reject based on `nonce_policy`
Example client code:
NONCE=$(uuidgen)
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
curl -X POST http://localhost:8080/agents/echo/ \
-H "X-Sentinel-Nonce: $NONCE" \
-H "X-Sentinel-Timestamp: $TIMESTAMP" \
-H "Content-Type: application/json" \
-d '{"jsonrpc": "2.0", "id": "1", "method": "message/send", ...}'
Flush the nonce cache (via MCP):
MCP tool: flush_replay_cache
If a2a-sentinel runs behind a reverse proxy (nginx, load balancer), configure trusted_proxies so rate limiting uses the real client IP, not the proxy IP.
Attacker at 203.0.113.99
↓
Reverse proxy at 10.0.0.1
↓
sentinel (RemoteAddr = 10.0.0.1)
↓
Rate limiter sees: 10.0.0.1 for every request, allows 200 req/min in total
All clients behind the proxy share one bucket, so a single attacker can exhaust it for everyone
listen:
  trusted_proxies:
    - "10.0.0.0/8"   # Trust our private network
    - "203.0.113.5"  # Trust specific proxy IP
Now sentinel extracts the real client IP from the X-Forwarded-For header:
X-Forwarded-For: 203.0.113.99, 10.0.0.1
trusted_proxies: [10.0.0.0/8, 203.0.113.5]
Algorithm walks from right to left:
10.0.0.1? → trusted (10.0.0.0/8)
203.0.113.99? → NOT trusted → extract this as real client IP
Result: Rate limiter now sees 203.0.113.99 and enforces 200 req/min per real client.
If trusted_proxies is empty → use RemoteAddr only (don't trust X-Forwarded-For). This is the safest default.
Never trust X-Forwarded-For without explicitly configuring trusted_proxies.
The MCP management server (port 8081, localhost-only) implements MCP 2025-11-25 Streamable HTTP with a 3-state authentication model.
| State | Condition | Access |
|---|---|---|
| Anonymous | No `Authorization` header | Read-only tools and resources |
| Authenticated | Valid `Authorization: Bearer <token>` | All tools and resources |
| Rejected | Invalid `Authorization: Bearer <wrong>` | 401 Unauthorized, no fallback |
The key security property: invalid tokens are always rejected (no silent downgrade to anonymous). This prevents token confusion attacks where a misconfigured client might accidentally gain anonymous access.
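The three states reduce to one small decision function. The Python sketch below is an illustration, not the MCP server's code; the function name and state strings are hypothetical, and treating any presented header as rejected when no token is configured is an assumption.

```python
import hmac


def mcp_auth_state(authorization, configured_token):
    """Map an Authorization header value to anonymous / authenticated / rejected."""
    if authorization is None:
        return "anonymous"  # no header: read-only access
    scheme, _, presented = authorization.partition(" ")
    if (
        configured_token
        and scheme.lower() == "bearer"
        and hmac.compare_digest(presented.encode(), configured_token.encode())
    ):
        return "authenticated"
    # A header was sent but did not validate: never fall back to anonymous.
    return "rejected"
```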
Anonymous sessions see only 9 read-only tools. Authenticated sessions see all 15 tools (9 read + 6 write). This means:
- Write operations cannot be discovered by unauthenticated clients
- Read tools are intentionally public for monitoring integrations
On initialize, the server returns a Mcp-Session-Id header containing a crypto-random 16-byte hex value. Subsequent requests must include this header. GET and DELETE requests return 405 Method Not Allowed (POST-only per Streamable HTTP spec).
The MCP server always binds to 127.0.0.1 only. It is never reachable from the network, even if listen.address is set to 0.0.0.0. This is an architectural guarantee, not a configuration option.
mcp:
  enabled: true
  port: 8081
  auth:
    token: "your-mcp-token"
Without `auth.token`, write tools return a tool-level error (-32001) when called. Read tools and resources remain accessible anonymously regardless.
Every security error includes:
- Code: HTTP status (401, 403, 429, etc.)
- Message: Brief human-readable summary
- Hint: Developer guidance on how to fix (EDUCATIONAL)
- DocsURL: Link to detailed documentation
| Error | Code | Hint | Cause |
|---|---|---|---|
| ErrAuthRequired | 401 | "Set Authorization header: 'Bearer '" | No auth header in passthrough-strict or jwt mode |
| ErrAuthInvalid | 401 | "Check token expiry and issuer" | JWT signature invalid, expired, wrong issuer/audience |
| ErrForbidden | 403 | "Check agent permissions and scope configuration" | Authenticated but lacks permission for this agent |
| ErrRateLimited | 429 | "Wait before retrying. Configure security.rate_limit in sentinel.yaml" | IP or user rate limit exceeded |
| ErrStreamLimitExceeded | 429 | "Max streams per agent reached. Configure agents[].max_streams" | Too many concurrent SSE streams on this agent |
| ErrSSRFBlocked | 403 | "URL resolves to private network. Use public URLs or configure security.push.allowed_domains" | Push notification URL resolves to a private network |
| ErrReplayDetected | 409 | "Include unique nonce and current timestamp in request" | Nonce already seen or timestamp expired |
| ErrGlobalLimitReached | 503 | "Gateway is at maximum connections. Try again shortly" | Gateway-wide rate limit hit |
| ErrAgentUnavailable | 503 | "Check agent health with GET /readyz" | Agent unhealthy (failed card fetch, etc.) |
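On the client side, the table suggests a simple split: 429 and 503 are worth retrying after a pause, while 401 and 403 need a configuration fix first. The sketch below is one way a client might encode that. The backoff schedule and function names are assumptions, not something sentinel prescribes.

```python
from typing import List

# Statuses whose hints say to wait and try again (429, 503 in the table above).
RETRYABLE_STATUS = {429, 503}


def should_retry(status: int) -> bool:
    """True when resending the same request later may succeed.

    401 and 403 need a credential or permission fix. 409 (replay detected)
    needs a fresh nonce and timestamp, so the identical request is not resent.
    """
    return status in RETRYABLE_STATUS


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Exponential backoff in seconds, doubling from `base` and capped at `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]
```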
security:
  # ── Authentication ──
  auth:
    mode: passthrough-strict        # passthrough | passthrough-strict | jwt | api-key | none
    allow_unauthenticated: false    # If false, require Authorization header
    schemes:
      - type: bearer
        jwt:
          issuer: https://auth.example.com
          audience: my-api
          jwks_url: https://auth.example.com/.well-known/jwks.json

  # ── Rate Limiting ──
  rate_limit:
    enabled: true
    ip:
      per_ip: 200                   # requests/minute per IP
      burst: 50                     # allow burst up to 50
      cleanup_interval: 5m          # remove inactive entries after 5min
    user:
      per_user: 100                 # requests/minute per user
      burst: 20                     # allow burst up to 20
      cleanup_interval: 5m          # remove inactive entries after 5min
    per_agent: 500                  # per-agent limit (not yet enforced)

  # ── Agent Card Security ──
  card_signature:
    require: false                  # Set true to require JWS-signed Agent Cards
    trusted_jwks_urls: []           # URLs to trusted agent JWKS endpoints
    cache_ttl: 1h                   # Cache JWKS keys for this long

  # ── Policy Engine (ABAC) ──
  policies:
    - name: example-policy          # Human-readable name
      priority: 10                  # Evaluation order (lower = first)
      effect: deny                  # allow | deny
      conditions:
        source_ip:                  # IP-based conditions
          cidr: []                  # Match these CIDRs
          not_cidr: []              # Match if NOT in these CIDRs
        user: []                    # Match these users
        user_not: []                # Match if user NOT in this list
        agent: []                   # Match these agent names
        method: []                  # Match these A2A methods
        header:                     # Match header values (glob patterns)
          X-Custom: ["value*"]
        header_missing: []          # Match if these headers are absent
        time:
          within: ""                # Time range "HH:MM-HH:MM"
          outside: ""               # Outside time range
          timezone: "UTC"           # IANA timezone
          days: []                  # Day-of-week filter

  # ── Push Notification Protection ──
  push:
    block_private_networks: true    # Block private network push URLs (SSRF defense)
    allowed_domains: []             # Domains allowed even if resolving to private IPs
    require_https: true             # Require HTTPS for push notification URLs
    dns_fail_policy: block          # block (fail-closed) | allow (fail-open) on DNS failures
    require_challenge: false        # Require challenge verification
    hmac_secret: ""                 # Sign outbound webhooks with HMAC-SHA256

  # ── Replay Attack Prevention ──
  replay:
    enabled: true                   # Enable nonce + timestamp replay detection
    window: 5m                      # Accept requests within this time window
    nonce_policy: warn              # warn (log only) | require (reject duplicates)
    nonce_source: auto              # auto (header > id) | header | jsonrpc-id
    clock_skew: 5s                  # Timestamp clock skew tolerance
    store: memory                   # memory | redis
    redis_url: ""                   # Redis URL if store: redis
    cleanup_interval: 60s           # Cleanup expired nonces at this interval

listen:
  host: 0.0.0.0
  port: 8080
  max_connections: 1000             # Max total TCP connections
  global_rate_limit: 5000           # requests/minute, all traffic
  trusted_proxies: []               # IPs/CIDRs to trust X-Forwarded-For from
  tls:
    cert_file: /path/to/cert.pem
    key_file: /path/to/key.pem

agents:
  - name: my-agent
    url: https://agent.example.com
    card_path: /.well-known/agent.json
    poll_interval: 60s
    timeout: 30s
    max_streams: 10                 # Concurrent SSE streams
    allow_insecure: false           # Require HTTPS (set true for dev only)
    card_change_policy: alert       # alert | auto | approve
    health_check:
      enabled: true
      interval: 30s

Found a security issue in a2a-sentinel? Please report it responsibly:
Send details to security@a2a-sentinel.dev (to be published):
- Description of vulnerability
- Steps to reproduce
- Potential impact
- Suggested fix (if you have one)
Do NOT:
- Post vulnerability details publicly
- Open GitHub issues for security flaws
- Attempt unauthorized access to systems
Alternatively, use GitHub's private security advisory:
- Go to https://github.com/raeseoklee/a2a-sentinel
- Click "Security" → "Report a vulnerability"
- Fill in details and submit
GitHub will notify maintainers privately. You'll receive updates as the issue is resolved.
- Day 0: You report vulnerability
- Day 1-7: Maintainers acknowledge and begin investigation
- Day 7-14: Fix is developed and tested
- Day 14-21: Fix is released
- Day 21+: Public disclosure (CVE if applicable)
We appreciate your help securing a2a-sentinel for everyone.
- README.md — Quick start, features, architecture
- ARCHITECTURE.md — System architecture and request flow
- ERRORS.md — Error catalog and troubleshooting
- Configuration Reference — Full sentinel.yaml schema with all options
- A2A Protocol Spec — Official A2A specification
Security ON by Default. Built for developers who want protection without complexity.