feat(router): add LLM routing with cost optimization and pretrained configs (#476)
Conversation
Pull request overview
Copilot reviewed 16 out of 18 changed files in this pull request and generated 10 comments.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
rbs333
left a comment
I think we still have some disconnects on the design of this feature. Let's maybe set up some time to talk through it.
🛡️ Jit Security Scan Results: ✅ No security findings were detected in this PR
Security scan by Jit
@rbs333 Reworked the PR based on your feedback. Here's where things stand:

1. No separate class: `SemanticRouter` is the API. All LLM routing logic (cost optimization, confidence scoring, …) lives on `SemanticRouter`.
2. "Tier" → "Route"
3. Callable pattern:

```python
from redisvl.extensions.router import SemanticRouter

router = SemanticRouter.from_pretrained("default", redis_url="redis://localhost:6379")
match = router("hello")  # -> RouteMatch(name="simple", model="openai/gpt-4.1-nano", ...)
```

5. Dead code removed
6. Notebook fully rewritten; every cell uses …
…onfigs

Extend SemanticRouter to support LLM model selection by adding optional model, confidence, cost optimization, and multi-match capabilities to the existing routing infrastructure.

When a Route includes a `model` field, the router returns the LiteLLM-compatible model identifier alongside the match, with a confidence score derived from vector distance. Cost-optimized routing biases toward cheaper models when semantic distances are close, using a configurable cost_weight penalty.

Key additions to SemanticRouter:
- Route.model (optional) for LiteLLM model identifiers
- RouteMatch.confidence, .alternatives, .metadata fields
- RoutingConfig.cost_optimization and .cost_weight settings
- RoutingConfig.default_route for fallback when no match is found
- from_pretrained() to load routers with pre-computed embeddings
- export_with_embeddings() to serialize routers with vectors
- AsyncSemanticRouter with full async parity

A built-in "default" pretrained config ships with 3 tiers (simple, standard, expert) mapped to GPT-4.1 Nano, Claude Sonnet 4.5, and Claude Opus 4.5, using pre-computed sentence-transformers embeddings.

Backward compatibility:
- LLMRouter/AsyncLLMRouter provided as deprecated wrappers
- ModelTier subclass enforces the required model field
- Legacy field names (tiers/default_tier) mapped bidirectionally
- Existing SemanticRouter usage is fully unaffected

Includes integration tests, unit tests for schema validation, a user guide notebook, and a pretrained config generation script.
rbs333
left a comment
Looking good! I think now it just needs an update on the docs and then a quick latency check to make sure we're not adding overhead vs. the previous router's lookup speed.
```python
last_error = e
if attempt < 2:  # Don't download on last attempt
    try:
        nltk.download("stopwords", quiet=True)
```
Are these changes for the router?

Extends `SemanticRouter` with LLM model selection, cost-optimized routing, and pretrained configurations, routing queries to the right model using Redis vector search.

Design

LLM routing is integrated directly into `SemanticRouter`. When a `Route` includes an optional `model` field, the router returns the LiteLLM-compatible model identifier alongside the match, with a confidence score derived from vector distance (1 - distance/2).

Schema extensions (all optional, no breaking changes):

- `Route` gains `model: Optional[str]` and `metadata: Dict`
- `RouteMatch` gains `model`, `confidence`, `alternatives`, and `metadata` fields
- `RoutingConfig` gains `cost_optimization`, `cost_weight`, and `default_route`
- `router(query)` returns a `RouteMatch`; `route_many()` returns multiple ranked matches
- `AsyncSemanticRouter` provides full async parity

Routes without a `model` field work exactly as before; existing `SemanticRouter` usage is unaffected.

Usage
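Before the API examples, the distance-to-confidence mapping described in the design section can be sketched standalone (a minimal illustration; the function name is hypothetical, and it assumes cosine distance in [0, 2]):

```python
# Sketch of the stated confidence formula: confidence = 1 - distance/2.
# A cosine distance of 0.0 (identical vectors) maps to confidence 1.0;
# the maximum distance of 2.0 maps to confidence 0.0.
def confidence_from_distance(distance: float) -> float:
    """Map a cosine vector distance in [0, 2] to a [0, 1] confidence score."""
    return 1.0 - distance / 2.0
```

So a match at distance 0.3 would report a confidence of 0.85.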
Basic LLM routing:
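A sketch of what this could look like, assuming the extended API described above and a Redis instance at `redis://localhost:6379`; the route names, references, and model identifiers here are illustrative, not taken from the PR:

```python
from redisvl.extensions.router import Route, SemanticRouter

# Routes may carry an optional LiteLLM-compatible `model` identifier.
routes = [
    Route(
        name="simple",
        references=["hi", "what time is it"],
        model="openai/gpt-4.1-nano",
    ),
    Route(
        name="expert",
        references=["design a distributed system", "prove this theorem"],
        model="anthropic/claude-opus-4.5",
    ),
]

router = SemanticRouter(
    name="llm-router",
    routes=routes,
    redis_url="redis://localhost:6379",
)

match = router("hello there")  # callable pattern -> RouteMatch
print(match.name, match.model, match.confidence)
```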
Pretrained config: ships with a 3-route config (simple/standard/expert) mapped to Bloom's Taxonomy levels, with pre-computed `sentence-transformers/all-mpnet-base-v2` embeddings.

Cost-optimized routing: when multiple routes match with similar distances, a cost penalty biases toward cheaper models:
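A combined sketch of the two features above: load the built-in pretrained config, then enable cost-biased selection. This assumes `cost_optimization`/`cost_weight` live on `RoutingConfig` as described; the `cost_weight` value is illustrative:

```python
from redisvl.extensions.router import SemanticRouter
from redisvl.extensions.router.schema import RoutingConfig

# Built-in config with simple/standard/expert routes and
# pre-computed embeddings (no vectorizer warm-up needed).
router = SemanticRouter.from_pretrained("default", redis_url="redis://localhost:6379")

# Bias toward cheaper models when semantic distances are close.
router.update_routing_config(
    RoutingConfig(max_k=3, cost_optimization=True, cost_weight=0.2)
)

match = router("summarize this paragraph")
print(match.model, [alt.model for alt in match.alternatives])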
Async:
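An async sketch, assuming `AsyncSemanticRouter` mirrors the sync surface (the awaitable callable and async `from_pretrained()` are assumptions based on the sync API):

```python
import asyncio

from redisvl.extensions.router import AsyncSemanticRouter

async def main():
    # Assumed async counterpart of SemanticRouter.from_pretrained().
    router = await AsyncSemanticRouter.from_pretrained(
        "default", redis_url="redis://localhost:6379"
    )
    match = await router("hello")  # awaitable callable pattern (assumed)
    print(match.name, match.model)

asyncio.run(main())
```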
Export/import with embeddings:
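A sketch of round-tripping a router with its vectors. It assumes `export_with_embeddings()` returns a JSON-serializable dict and that the existing `SemanticRouter.from_dict()` constructor accepts it without re-embedding; the file name is illustrative:

```python
import json

from redisvl.extensions.router import SemanticRouter

# Assumes `router` is an already-constructed SemanticRouter.
# Serialize it along with the pre-computed reference embeddings.
data = router.export_with_embeddings()
with open("router_with_vectors.json", "w") as f:
    json.dump(data, f)

# Later, or on another machine: rebuild without re-vectorizing.
with open("router_with_vectors.json") as f:
    restored = SemanticRouter.from_dict(
        json.load(f), redis_url="redis://localhost:6379"
    )
```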
Files changed
- `redisvl/extensions/router/semantic.py`, `schema.py`, `__init__.py`
- `redisvl/extensions/router/pretrained/__init__.py`, `default.json`
- `redisvl/query/query.py`, `redisvl/utils/full_text_query_helper.py`
- `tests/unit/test_llm_router_schema.py`, `tests/integration/conftest.py`, `tests/unit/conftest.py`
- `docs/user_guide/13_llm_router.ipynb`
- `scripts/generate_pretrained_config.py`