
feat(BA-5502): Support runtime_variants list in deployment config#10661

Draft
jopemachine wants to merge 11 commits into main from BA-5502

@jopemachine jopemachine commented Mar 30, 2026

resolves #10660 (BA-5502)

Summary

  • Change DeploymentConfig.runtime_variant (single value) to runtime_variants (list)
  • The user's explicit choice always takes priority; it is validated against the list when one is present
  • When user does not specify (Sentinel.TOKEN default):
    • Single variant in list → auto-selected
    • Multiple variants → RuntimeVariantNotSpecified error
    • No list → falls back to CUSTOM
  • Both runtime_variant = "vllm" (legacy) and runtime_variants = ["vllm", "sglang"] accepted in deployment config TOML
  • Resolution logic consolidated into DeploymentConfig.resolve_runtime_variant()
  • RuntimeVariantNotAllowed for invalid choice, RuntimeVariantNotSpecified for missing choice
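The resolution rules listed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: the real logic is a method on DeploymentConfig, whereas the free-function signature, the enum definitions, and the choice to treat an empty list the same as a missing list are all assumptions made here.

```python
from __future__ import annotations

import enum
from collections.abc import Sequence


class RuntimeVariant(enum.Enum):
    # Illustrative subset only; the real enum lives in the codebase.
    VLLM = "vllm"
    SGLANG = "sglang"
    CUSTOM = "custom"


class Sentinel(enum.Enum):
    # Marks "the user did not specify a value".
    TOKEN = "token"


class RuntimeVariantNotAllowed(Exception):
    """The requested variant is not in the configured list."""


class RuntimeVariantNotSpecified(Exception):
    """Several variants are configured but the user chose none."""


def resolve_runtime_variant(
    requested: RuntimeVariant | Sentinel,
    allowed: Sequence[RuntimeVariant] | None,
) -> RuntimeVariant:
    """Apply: user request > runtime_variants list > CUSTOM default."""
    if isinstance(requested, RuntimeVariant):
        # An explicit choice wins, but is checked against the list if any.
        # Assumption: an empty list is treated like "no list".
        if allowed and requested not in allowed:
            raise RuntimeVariantNotAllowed(requested.value)
        return requested
    if not allowed:
        # Nothing requested and nothing configured: fall back to CUSTOM.
        return RuntimeVariant.CUSTOM
    if len(allowed) == 1:
        # A single configured variant is auto-selected.
        return allowed[0]
    # Several candidates and no explicit choice: refuse to guess.
    raise RuntimeVariantNotSpecified(
        "choose one of: " + ", ".join(v.value for v in allowed)
    )
```

Under these assumptions, `resolve_runtime_variant(Sentinel.TOKEN, [RuntimeVariant.VLLM])` yields `RuntimeVariant.VLLM`, while passing `Sentinel.TOKEN` with both `VLLM` and `SGLANG` configured raises `RuntimeVariantNotSpecified`.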

Example deployment-config.toml

runtime_variants = ["vllm", "sglang"]

[environment]
image = "default-image:latest"
architecture = "x86_64"

[vllm]
resource_slots = { cpu = 8 }

[sglang]
resource_slots = { cpu = 12 }

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation

@github-actions github-actions bot added size:XS ~10 LoC size:L 100~500 LoC comp:manager Related to Manager component and removed size:XS ~10 LoC labels Mar 30, 2026
@jopemachine jopemachine changed the title fix(BA-5502): Support forced runtime_variant in service config file feat(BA-5502): Support forced runtime_variant in service config file Mar 30, 2026
@jopemachine jopemachine added this to the 26.4 milestone Mar 31, 2026
@jopemachine jopemachine changed the title feat(BA-5502): Support forced runtime_variant in service config file feat(BA-5502): Support runtime_variants list in service definition Mar 31, 2026
@github-actions github-actions bot added size:XL 500~ LoC size:L 100~500 LoC and removed size:L 100~500 LoC size:XL 500~ LoC labels Mar 31, 2026
@jopemachine jopemachine changed the title feat(BA-5502): Support runtime_variants list in service definition feat(BA-5502): Support runtime_variants list in deployment config Mar 31, 2026
jopemachine and others added 11 commits April 1, 2026 15:27
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change `ModelServiceDefinition.runtime_variant` (single) to
`runtime_variants` (list) so that a service-definition.toml can declare
several allowed runtime variants.

- Single variant in the list: forced automatically (same as before).
- Multiple variants: the user must pick one via the API request;
  `RuntimeVariantNotAllowed` is raised when the choice is invalid.
- Both `runtime_variant = "vllm"` and `runtime_variants = ["vllm", "sglang"]`
  are accepted in TOML (normalized by a Pydantic model_validator).
- Resolution logic lives in `ModelServiceDefinition.resolve_runtime_variant()`
  to avoid duplication across code paths.
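
To illustrate the normalization of the legacy key, here is a stdlib-only sketch that works on the plain dict a TOML parser would return. The commit itself does this inside a Pydantic model_validator; the helper name `normalize_runtime_variants` is hypothetical, and rejecting a file that sets both keys is an assumption, since the commit message does not say how that case is handled.

```python
from __future__ import annotations

from typing import Any


def normalize_runtime_variants(raw: dict[str, Any]) -> dict[str, Any]:
    """Rewrite legacy `runtime_variant = "x"` as `runtime_variants = ["x"]`."""
    data = dict(raw)  # shallow copy so the caller's dict is left untouched
    legacy = data.pop("runtime_variant", None)
    if legacy is None:
        return data
    if "runtime_variants" in data:
        # Assumption: setting both keys is treated as a configuration error.
        raise ValueError(
            "specify either runtime_variant or runtime_variants, not both"
        )
    data["runtime_variants"] = [legacy]
    return data
```

With this helper, `{"runtime_variant": "vllm"}` becomes `{"runtime_variants": ["vllm"]}`, and a dict that already uses the list form passes through unchanged.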

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…r priority

Replace CUSTOM default with Sentinel.TOKEN on ExecutionSpec.runtime_variant
so that "user did not specify" and "user chose CUSTOM" are distinguishable.

resolve_runtime_variant now follows: user request > runtime_variants list > CUSTOM default.
- User specified a variant: validate against the list, use it.
- User did not specify (Sentinel): single variant in list → use it,
  multiple → error, no list → fall back to CUSTOM.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract conditions into named properties and methods for clarity
- Add `RuntimeVariantNotSpecified` error for unspecified multi-variant case
- Remove all `type: ignore` by using explicit None checks for narrowing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…iority semantics

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d fixtures

- Remove unused `_has_variant_constraint` property
- Consolidate 8 individual test methods into 2 parametrized tests (5 success + 3 error cases)
- Extract `make_draft_with_variant` fixture for draft creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ntime_variant`

Add type narrowing (cast/isinstance) at call sites where
`RuntimeVariant | Sentinel` is passed to APIs expecting `RuntimeVariant`.
These are all post-resolve paths where the value is always `RuntimeVariant`.
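
A small sketch of what such narrowing can look like at a call site. The helper `require_resolved` and the consumer `describe_variant` are hypothetical names used only for illustration, and the enums are redefined here so the snippet is self-contained.

```python
from __future__ import annotations

import enum


class RuntimeVariant(enum.Enum):
    # Illustrative subset only.
    VLLM = "vllm"
    CUSTOM = "custom"


class Sentinel(enum.Enum):
    TOKEN = "token"


def require_resolved(value: RuntimeVariant | Sentinel) -> RuntimeVariant:
    """Narrow `RuntimeVariant | Sentinel` to `RuntimeVariant` with a runtime check."""
    if not isinstance(value, RuntimeVariant):
        raise TypeError("runtime_variant must be resolved before this point")
    # After the isinstance check, type checkers see `value` as RuntimeVariant.
    return value


def describe_variant(variant: RuntimeVariant) -> str:
    # Hypothetical stand-in for an API that only accepts RuntimeVariant.
    return f"runtime={variant.value}"
```

An `isinstance` check narrows the type and also fails loudly if an unresolved sentinel slips through, whereas `typing.cast` only informs the type checker and performs no check at runtime.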

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…e tests

Replace raw tuples with `_ResolveVariantTestCase` dataclass for improved readability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Labels

comp:manager Related to Manager component size:L 100~500 LoC

Development

Successfully merging this pull request may close these issues.

Support forced runtime_variant in deployment config file
