Summary
Add workflow-level cost budgets with enforcement and optional adaptive model routing — so workflows can cap spending and automatically fall back to cheaper models under budget pressure.
Motivation
Every major AI framework is adding cost-aware execution. Inference cost is confirmed as the dominant ongoing cost driver (not training). Conductor tracks costs today but doesn't enforce limits — an evaluator-optimizer loop with an expensive model can burn through dollars with no guardrails. Research on cost-aware routing (ParetoBandit) and mixture-of-models approaches show that intelligent model selection can match frontier quality at significantly lower cost.
Proposed Design
Workflow-Level Budget
workflow:
runtime:
budget:
max_cost: 5.00 # USD hard cap — workflow aborts if exceeded
warn_at: 3.50 # emit warning event at threshold
Per-Agent Fallback Models
agents:
- name: researcher
model: gpt-5.2
fallback_model: gpt-5.2-mini # used if budget pressure or primary model fails
- name: reviewer
model: claude-opus-4.5
fallback_model: claude-haiku-4.5
Adaptive Routing (Future Extension)
workflow:
runtime:
model_routing:
strategy: cost_aware # or "fixed" (current behavior)
# When cost_aware: switch to fallback_model when remaining budget < estimated agent cost
Behavior
- Budget checked before each agent execution
- If remaining budget goes negative after an agent completes: workflow aborts with
BudgetExceededError
- If remaining budget is less than estimated agent cost and
fallback_model is configured: use fallback model
- Warning event emitted when spend crosses
warn_at threshold
- Budget tracking visible in web dashboard (progress bar or running total)
- Cost estimates use existing pricing table + historical token averages per model
Events
| Event |
When |
budget_warning |
Spend crosses warn_at threshold |
budget_exceeded |
Spend exceeds max_cost — workflow aborting |
model_fallback |
Agent switches to fallback_model due to budget pressure |
Example: Budget-Aware Research Loop
workflow:
name: budget-research
runtime:
budget:
max_cost: 3.00
warn_at: 2.00
default_model: gpt-5.2
entry_point: researcher
agents:
- name: researcher
model: gpt-5.2
fallback_model: gpt-5.2-mini
prompt: "Research the topic provided"
routes:
- to: reviewer
- name: reviewer
model: gpt-5.2
fallback_model: gpt-5.2-mini
prompt: "Review the quality of the research findings"
routes:
- to: researcher
when: "{{ quality_score < 8 }}"
- to: $end
With max_cost: 3.00, the loop runs with gpt-5.2 while budget allows, then switches to gpt-5.2-mini for remaining iterations, and aborts if even the cheap model would exceed the cap.
Why It Fits Conductor
- Declarative, YAML-expressible — doesn't change execution semantics
- Builds on existing
UsageTracker and pricing infrastructure
- Prevents runaway costs in evaluator-optimizer loops (conductor's signature pattern)
fallback_model is a simple per-agent field, consistent with existing model override
- Phased delivery: budget enforcement first (simpler), adaptive routing as follow-up
Effort Estimate
Medium — budget enforcement is straightforward (check before execute, track after). Adaptive routing adds complexity with cost estimation and model switching logic.
Summary
Add workflow-level cost budgets with enforcement and optional adaptive model routing — so workflows can cap spending and automatically fall back to cheaper models under budget pressure.
Motivation
Every major AI framework is adding cost-aware execution. Inference cost is confirmed as the dominant ongoing cost driver (not training). Conductor tracks costs today but doesn't enforce limits — an evaluator-optimizer loop with an expensive model can burn through dollars with no guardrails. Research on cost-aware routing (ParetoBandit) and mixture-of-models approaches show that intelligent model selection can match frontier quality at significantly lower cost.
Proposed Design
Workflow-Level Budget
Per-Agent Fallback Models
Adaptive Routing (Future Extension)
Behavior
BudgetExceededErrorfallback_modelis configured: use fallback modelwarn_atthresholdEvents
budget_warningwarn_atthresholdbudget_exceededmax_cost— workflow abortingmodel_fallbackfallback_modeldue to budget pressureExample: Budget-Aware Research Loop
With
max_cost: 3.00, the loop runs withgpt-5.2while budget allows, then switches togpt-5.2-minifor remaining iterations, and aborts if even the cheap model would exceed the cap.Why It Fits Conductor
UsageTrackerand pricing infrastructurefallback_modelis a simple per-agent field, consistent with existingmodeloverrideEffort Estimate
Medium — budget enforcement is straightforward (check before execute, track after). Adaptive routing adds complexity with cost estimation and model switching logic.