Feature: Cost Budgets & Adaptive Model Routing #84

@jrob5756

Summary

Add workflow-level cost budgets with enforcement and optional adaptive model routing — so workflows can cap spending and automatically fall back to cheaper models under budget pressure.

Motivation

Major AI frameworks are adding cost-aware execution, and inference (not training) is the dominant ongoing cost driver. Conductor tracks costs today but doesn't enforce limits: an evaluator-optimizer loop with an expensive model can burn through dollars with no guardrails. Research on cost-aware routing (ParetoBandit) and mixture-of-models approaches shows that intelligent model selection can match frontier quality at significantly lower cost.

Proposed Design

Workflow-Level Budget

workflow:
  runtime:
    budget:
      max_cost: 5.00              # USD hard cap — workflow aborts if exceeded
      warn_at: 3.50               # emit warning event at threshold
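
A minimal sketch of how the `budget` block above might be validated once loaded, assuming `warn_at` is optional and must not exceed `max_cost`. The names `BudgetConfig` and `parse_budget` are hypothetical and not part of Conductor today.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BudgetConfig:
    """Hypothetical in-memory form of the proposed `runtime.budget` block."""

    max_cost: float                  # USD hard cap
    warn_at: Optional[float] = None  # USD warning threshold (assumed optional)

    def __post_init__(self) -> None:
        if self.max_cost <= 0:
            raise ValueError("max_cost must be positive")
        # Assumption: a warning threshold above the cap is a config error.
        if self.warn_at is not None and not (0 < self.warn_at <= self.max_cost):
            raise ValueError("warn_at must be greater than 0 and at most max_cost")


def parse_budget(runtime: dict) -> Optional[BudgetConfig]:
    """Return None when no budget is configured, i.e. no enforcement."""
    raw = runtime.get("budget")
    if raw is None:
        return None
    warn_at = float(raw["warn_at"]) if "warn_at" in raw else None
    return BudgetConfig(max_cost=float(raw["max_cost"]), warn_at=warn_at)
```

Rejecting `warn_at > max_cost` at load time is one possible design choice; it surfaces a misconfigured threshold before any agent runs.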

Per-Agent Fallback Models

agents:
  - name: researcher
    model: gpt-5.2
    fallback_model: gpt-5.2-mini  # used under budget pressure or if the primary model fails

  - name: reviewer
    model: claude-opus-4.5
    fallback_model: claude-haiku-4.5

Adaptive Routing (Future Extension)

workflow:
  runtime:
    model_routing:
      strategy: cost_aware        # or "fixed" (current behavior)
      # When cost_aware: switch to fallback_model when remaining budget < estimated agent cost
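
The routing rule in the comment above could look roughly like this. It is a sketch only: `select_model` and its parameters are hypothetical names, and the estimate is assumed to be supplied by the caller.

```python
from typing import Optional


def select_model(strategy: str, model: str, fallback_model: Optional[str],
                 remaining_budget: float, estimated_cost: float) -> str:
    """Pick the model for the next agent run under the proposed strategies."""
    if strategy == "fixed":
        # Current behavior: always use the configured model.
        return model
    if strategy != "cost_aware":
        raise ValueError(f"unknown model_routing strategy: {strategy!r}")
    # cost_aware: fall back only when a fallback exists and the remaining
    # budget is less than the estimated cost of running the primary model.
    if fallback_model is not None and remaining_budget < estimated_cost:
        return fallback_model
    return model
```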

Behavior

  • Budget checked before each agent execution
  • If remaining budget goes negative after an agent completes: workflow aborts with BudgetExceededError
  • If remaining budget is less than estimated agent cost and fallback_model is configured: use fallback model
  • Warning event emitted when spend crosses warn_at threshold
  • Budget tracking visible in web dashboard (progress bar or running total)
  • Cost estimates use existing pricing table + historical token averages per model
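
As an illustration of the last bullet, a per-agent estimate could combine a pricing entry with historical token averages as below. `Price`, `TokenAverage`, and `estimate_cost` are hypothetical names, and per-million-token pricing is an assumption about how the existing pricing table is keyed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Assumed pricing entry: USD per one million tokens."""

    input_per_mtok: float
    output_per_mtok: float


@dataclass(frozen=True)
class TokenAverage:
    """Assumed historical average token counts per agent run for one model."""

    input_tokens: float
    output_tokens: float


def estimate_cost(price: Price, avg: TokenAverage) -> float:
    """Estimated USD cost of one agent run: tokens times price, per direction."""
    total = (avg.input_tokens * price.input_per_mtok
             + avg.output_tokens * price.output_per_mtok)
    return total / 1_000_000
```

For example, with made-up prices of 2.00 and 8.00 USD per million input and output tokens and averages of 10,000 input and 2,000 output tokens, the estimate is 0.036 USD.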

Events

Event           | When
budget_warning  | Spend crosses warn_at threshold
budget_exceeded | Spend exceeds max_cost; workflow aborting
model_fallback  | Agent switches to fallback_model due to budget pressure
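
One way the two budget events could be derived from running spend is sketched below. `BudgetTracker` is a hypothetical stand-in rather than the existing UsageTracker, and it assumes `budget_warning` fires once, when spend first reaches `warn_at`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BudgetTracker:
    """Hypothetical tracker that turns recorded spend into event names."""

    max_cost: float
    warn_at: float
    spent: float = 0.0
    warned: bool = False

    def record(self, cost: float) -> List[str]:
        """Add one agent's cost and return the events it triggers."""
        self.spent += cost
        events: List[str] = []
        # Assumption: warn a single time, on the first crossing.
        if not self.warned and self.spent >= self.warn_at:
            self.warned = True
            events.append("budget_warning")
        if self.spent > self.max_cost:
            events.append("budget_exceeded")
        return events
```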

Example: Budget-Aware Research Loop

workflow:
  name: budget-research
  runtime:
    budget:
      max_cost: 3.00
      warn_at: 2.00
    default_model: gpt-5.2
  entry_point: researcher

agents:
  - name: researcher
    model: gpt-5.2
    fallback_model: gpt-5.2-mini
    prompt: "Research the topic provided"
    routes:
      - to: reviewer

  - name: reviewer
    model: gpt-5.2
    fallback_model: gpt-5.2-mini
    prompt: "Review the quality of the research findings"
    routes:
      - to: researcher
        when: "{{ quality_score < 8 }}"
      - to: $end

With max_cost: 3.00, the loop runs with gpt-5.2 while the remaining budget covers its estimated cost, then switches to gpt-5.2-mini for the remaining iterations, and aborts with BudgetExceededError once spend exceeds the cap.
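
The walkthrough above can be simulated in a few lines. This is a simplified sketch: it assumes each call costs exactly its estimate, that the reviewer never reaches the quality threshold, and it works in integer cents to avoid float rounding. `simulate_loop` and the error's fields are hypothetical.

```python
from typing import Dict, List


class BudgetExceededError(Exception):
    """Raised when cumulative spend passes the cap (fields are hypothetical)."""

    def __init__(self, spent_cents: int, models_used: List[str]) -> None:
        super().__init__(f"budget exceeded: spent {spent_cents} cents")
        self.spent_cents = spent_cents
        self.models_used = models_used


def simulate_loop(max_cost_cents: int, cost_cents: Dict[str, int],
                  model: str, fallback_model: str, max_steps: int) -> List[str]:
    """Run up to max_steps agent calls and return the model used per call."""
    spent = 0
    models_used: List[str] = []
    for _ in range(max_steps):
        # Check before execution: fall back if the primary no longer fits.
        remaining = max_cost_cents - spent
        chosen = model if remaining >= cost_cents[model] else fallback_model
        # Track after execution: abort once spend has passed the cap.
        spent += cost_cents[chosen]
        models_used.append(chosen)
        if spent > max_cost_cents:
            raise BudgetExceededError(spent, models_used)
    return models_used
```

With made-up per-call costs of 80 cents for gpt-5.2 and 10 cents for gpt-5.2-mini and a 300-cent cap, this makes three gpt-5.2 calls, then seven gpt-5.2-mini calls, and raises on the last one at 310 cents.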

Why It Fits Conductor

  • Declarative, YAML-expressible — doesn't change execution semantics
  • Builds on existing UsageTracker and pricing infrastructure
  • Prevents runaway costs in evaluator-optimizer loops (Conductor's signature pattern)
  • fallback_model is a simple per-agent field, consistent with existing model override
  • Phased delivery: budget enforcement first (simpler), adaptive routing as follow-up

Effort Estimate

Medium — budget enforcement is straightforward (check before execute, track after). Adaptive routing adds complexity with cost estimation and model switching logic.

Metadata

Labels

enhancement (New feature or request)