Feature: Output Guardrails & Validation Hooks #81

## Summary

Add a `guardrails` section to agent definitions for semantic output validation that goes beyond JSON schema type checking, including regex patterns, length limits, and custom script-based checks.

## Motivation

Reported research suggests that frontier models can spontaneously exhibit deceptive behaviors in multi-agent settings (UC Berkeley/UC Santa Cruz study), that 30-50% of AI agents bypass ethical constraints under KPI pressure, and that RAG document poisoning can lead to fabricated financial data. Conductor validates output *types* today (JSON schema) but has no way to validate output *content* or *semantics*.

## Proposed Design

```yaml
agents:
  - name: financial_analyst
    model: gpt-5.2
    output:
      recommendation:
        type: string
    guardrails:
      - type: regex_deny
        pattern: "(?i)(guaranteed|risk.free|100%)"
        message: "Output contains prohibited financial claims"
      - type: regex_require
        pattern: "(?i)(disclaimer|risk)"
        message: "Output must include risk disclaimer"
      - type: max_length
        chars: 5000
      - type: custom_script
        command: "python validate_output.py"
        # stdin: agent output JSON
        # exit 0 = pass, exit 1 = fail (stderr = failure message)
```
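The `custom_script` contract in the comments above (output JSON on stdin, exit code 0 or 1, failure message on stderr) could be satisfied by a script along these lines. This is a minimal sketch, not the project's actual `validate_output.py`: the `find_violations` helper and the "no promised return figure" rule are hypothetical, chosen only to illustrate the contract.

```python
import json
import re
import sys

# Hypothetical house rule: flag text such as "12% return" or "8.5 % gain".
PROMISED_RETURN = re.compile(r"\b\d+(?:\.\d+)?\s*%\s+(?:return|gain)", re.IGNORECASE)


def find_violations(output):
    """Return a list of failure messages; an empty list means the output passes."""
    if not isinstance(output, dict):
        return ["output must be a JSON object"]
    text = output.get("recommendation")
    if not isinstance(text, str) or not text.strip():
        return ["recommendation must be a non-empty string"]
    if PROMISED_RETURN.search(text):
        return ["recommendation must not promise a specific return figure"]
    return []


def main(stdin=None, stderr=None):
    """Read agent output JSON from stdin; return 0 on pass, 1 on fail."""
    stdin = stdin if stdin is not None else sys.stdin
    stderr = stderr if stderr is not None else sys.stderr
    try:
        output = json.load(stdin)
    except json.JSONDecodeError as exc:
        print(f"invalid JSON on stdin: {exc}", file=stderr)
        return 1
    violations = find_violations(output)
    if violations:
        # The failure message goes to stderr, as the proposed contract describes.
        print("; ".join(violations), file=stderr)
        return 1
    return 0
```

When saved as a standalone script, the file would end with `sys.exit(main())` so that the return value becomes the process exit code.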

### Behavior on Failure

- A guardrail failure triggers an agent re-run with the violation feedback injected into the prompt
- `max_guardrail_retries` (default: 2) sets how many re-runs are attempted before a hard failure
- Events emitted: `guardrail_check`, `guardrail_pass`, `guardrail_fail`
- Works with retry policies (#80): guardrail retries are separate from provider error retries
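The failure behavior listed above might look roughly like the loop below. This is a sketch under stated assumptions, not Conductor's implementation: `run_with_guardrails`, `GuardrailError`, the `run_agent` and `emit` callables, the event payloads, and the wording of the injected feedback are all hypothetical. Only the event names and the `max_guardrail_retries` default come from the proposal.

```python
from typing import Callable, List, Optional

# A check returns None on pass, or a failure message on violation.
Check = Callable[[str], Optional[str]]


class GuardrailError(Exception):
    """Raised when output still violates guardrails after all retries."""


def run_with_guardrails(
    run_agent: Callable[[str], str],
    prompt: str,
    checks: List[Check],
    emit: Callable[[str, dict], None],
    max_guardrail_retries: int = 2,
) -> str:
    current_prompt = prompt
    violations: List[str] = []
    # One initial attempt plus up to max_guardrail_retries re-runs.
    for attempt in range(max_guardrail_retries + 1):
        output = run_agent(current_prompt)
        emit("guardrail_check", {"attempt": attempt})
        violations = [msg for check in checks if (msg := check(output)) is not None]
        if not violations:
            emit("guardrail_pass", {"attempt": attempt})
            return output
        emit("guardrail_fail", {"attempt": attempt, "violations": violations})
        # Inject the violation feedback into the prompt for the next run.
        feedback = "\n".join(f"- {v}" for v in violations)
        current_prompt = (
            f"{prompt}\n\nYour previous output violated these guardrails:\n{feedback}"
        )
    raise GuardrailError("; ".join(violations))
```

Keeping this loop independent of provider error handling matches the proposal's point that guardrail retries are separate from the retry policies in #80.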

### Built-in Guardrail Types

| Type | Description |
|------|-------------|
| `regex_deny` | Fail if output matches pattern |
| `regex_require` | Fail if output does NOT match pattern |
| `max_length` | Fail if output exceeds character limit |
| `min_length` | Fail if output is shorter than the minimum character count |
| `json_schema` | Validate against an additional JSON schema (beyond output type) |
| `custom_script` | Run external script, pass output via stdin, check exit code |
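To make the table concrete, the regex and length types could be evaluated by a small dispatcher such as the one below. It is a sketch with several assumptions that the proposal does not settle: the function name `evaluate_guardrail` is hypothetical, patterns are matched against the JSON-serialized output (so field names are searched along with values), `min_length` is assumed to take the same `chars` key as `max_length`, and a default message is used when `message` is absent. `json_schema` and `custom_script` are left out.

```python
import json
import re
from typing import Optional


def evaluate_guardrail(rule: dict, output: dict) -> Optional[str]:
    """Return None if the rule passes, otherwise a failure message."""
    # Assumption: checks run against the serialized output, keys included.
    text = json.dumps(output, ensure_ascii=False)
    kind = rule["type"]
    if kind == "regex_deny":
        failed = re.search(rule["pattern"], text) is not None
    elif kind == "regex_require":
        failed = re.search(rule["pattern"], text) is None
    elif kind == "max_length":
        failed = len(text) > rule["chars"]
    elif kind == "min_length":
        failed = len(text) < rule["chars"]
    else:
        raise ValueError(f"unsupported guardrail type: {kind}")
    if not failed:
        return None
    return rule.get("message", f"{kind} guardrail failed")
```

Matching against individual string fields instead of the serialized object would avoid accidental matches on field names, at the cost of a per-field configuration option.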

## Why It Fits Conductor

- Declarative, YAML-expressible — no code changes needed per workflow
- Script-based guardrails reuse existing `script` step infrastructure
- Pairs with retry policies (#80) — guardrail violation → retry with feedback context
- Essential for regulated industries (finance, healthcare) adopting Conductor
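On the runner side, invoking a script-based guardrail could be as small as the sketch below. It does not reflect the existing `script` step code, which is not shown in this issue: `run_custom_script` and the `timeout_s` parameter are hypothetical, and treating every non-zero exit code as a failure is an assumption (the proposal only specifies exit 1).

```python
import json
import shlex
import subprocess
from typing import Optional


def run_custom_script(command: str, output: dict, timeout_s: float = 30.0) -> Optional[str]:
    """Run a guardrail command; return None on pass or a failure message."""
    try:
        proc = subprocess.run(
            shlex.split(command),      # no shell, so the command is not re-interpreted
            input=json.dumps(output),  # agent output JSON is passed on stdin
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return f"guardrail script timed out after {timeout_s}s"
    if proc.returncode == 0:
        return None
    # stderr carries the failure message; fall back to the exit code if it is empty.
    return proc.stderr.strip() or f"guardrail script exited with code {proc.returncode}"
```

The returned message can be fed back to the agent in the same way as a violation from a built-in guardrail type.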

## Effort Estimate

Medium — new validation layer in `AgentExecutor` post-output, new schema fields, script runner reuse from existing script step infrastructure.

