
Add SurrealDB backend and v1 proxy API#2

Merged
CLoaKY233 merged 16 commits into main from phase-1-mvp
Mar 27, 2026
Conversation

@CLoaKY233
Owner

This pull request introduces a new valygate-surrealdb crate, adds SurrealDB schema definitions for core entities, and establishes foundational Rust modules for database configuration, error handling, cryptography, and data models. It also updates the workspace and dependencies to support these features.

The most important changes are:

SurrealDB Integration and Schema

  • Added the new crate valygate-surrealdb with its own Cargo.toml, registering it in the workspace and adding all required dependencies for SurrealDB integration, cryptography, and async support. (Cargo.toml, crates/surrealdb/Cargo.toml)
  • Introduced SurrealDB schema files defining tables and fields for user, virtual_api_key, provider_credential, model_catalog, and request_log, including permissions and unique indexes. (crates/surrealdb/schema/auth/001_account.surql, 002_virtual_api_key.surql, crates/surrealdb/schema/model/001_model_catalog.surql, crates/surrealdb/schema/provider/001_provider_credential.surql, crates/surrealdb/schema/request/001_request_log.surql, crates/surrealdb/schema/user/001_user.surql)
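The .surql files above use SurrealDB's DEFINE-statement DDL for tables, fields, permissions, and indexes. As a purely illustrative sketch (the field names, defaults, and permission clauses here are assumptions, not the shipped schema), a virtual_api_key table with a unique index might look like:

```sql
-- Illustrative only: the actual definitions live in the .surql files listed above.
DEFINE TABLE virtual_api_key SCHEMAFULL
  PERMISSIONS FOR select, update, delete WHERE user = $auth.id;

DEFINE FIELD user       ON virtual_api_key TYPE record<user>;
DEFINE FIELD key_hash   ON virtual_api_key TYPE string;
DEFINE FIELD enabled    ON virtual_api_key TYPE bool DEFAULT true;
DEFINE FIELD created_at ON virtual_api_key TYPE datetime DEFAULT time::now();

-- Unique index so a hashed key can resolve to at most one record.
DEFINE INDEX virtual_api_key_hash_idx ON virtual_api_key FIELDS key_hash UNIQUE;
```

Permissions tied to $auth.id are what let the schema restrict row access to the authenticated user, as the bullet above describes.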

Core Rust Modules for SurrealDB Crate

  • Added config.rs for SurrealDB connection configuration, including validation logic.
  • Added error.rs defining a custom DatabaseError enum for configuration, database, schema, and crypto errors.
  • Added crypto.rs with functions for encrypting/decrypting secrets, generating and hashing virtual API keys, and related tests.
  • Added models.rs defining Rust structs for all major database entities and API input types, mirroring the SurrealDB schema.
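A minimal, std-only sketch of what one such model struct could look like (field names are illustrative assumptions; the real structs also derive serde's Serialize/Deserialize and use SurrealDB record IDs rather than plain strings):

```rust
// Illustrative shape only — not the actual models.rs definitions.
#[derive(Debug, Clone)]
struct VirtualApiKey {
    id: String,       // the real crate uses a SurrealDB RecordId here
    user_id: String,
    key_hash: String, // only the hash is stored, never the plaintext key
    enabled: bool,
}

fn main() {
    let key = VirtualApiKey {
        id: "virtual_api_key:abc".into(),
        user_id: "user:example".into(),
        key_hash: "deadbeef".into(),
        enabled: true,
    };
    // Lookups mirror the schema's unique index: they go through key_hash.
    assert!(key.enabled);
    println!("{}", key.key_hash);
}
```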

Project Structure and Skills

  • Updated the workspace members to explicitly list core crates and added new dependencies to support async, crypto, and SurrealDB operations.
  • Added skill references for Rust async patterns, best practices, and engineering to the .claude/skills directory.

These changes lay the groundwork for SurrealDB-backed features, secure credential handling, and a robust Rust data model for the application.

Introduce valygate-surrealdb crate (models, schema, crypto, errors) and a proxy service with OpenAI/Anthropic adapters plus v1 HTTP handlers. Update workspace members, routing, and .gitignore.
@CLoaKY233 CLoaKY233 requested a review from Copilot March 23, 2026 19:00
@CLoaKY233 CLoaKY233 self-assigned this Mar 23, 2026
@coderabbitai

coderabbitai bot commented Mar 23, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • User authentication system with account signup and signin
    • Virtual API key management for service access control
    • Provider credential management for connecting to LLM services
    • OpenAI and Anthropic-compatible chat completions API with streaming support
    • Model catalog with capability discovery and filtering
    • Request logging and usage tracking
  • Documentation

    • Added comprehensive contribution guidelines and development setup instructions
    • Added Docker support for containerized deployment
    • Added security policy and code of conduct
    • Added GitHub issue and pull request templates

Walkthrough

Complete restructuring of the codebase from "ValyGate" to "ValyMux" including a new SurrealDB-backed authentication and management layer, RESTful API endpoints for user/credential/model operations, provider-agnostic chat completion proxying with OpenAI and Anthropic adapters, request logging, and comprehensive CI/CD infrastructure and documentation.

Changes

Cohort / File(s) Summary
Project Rename
Cargo.toml, README.md, crates/core/Cargo.toml, crates/telemetry/src/config.rs, src/rts/root.rs, src/main.rs, CLAUDE.md, .gitignore
Updated all references from "ValyGate" to "ValyMux". Renamed root crate, updated package names, service identifiers, and logging prefixes. Extended .gitignore to exclude .agents, /skills, .claude, and documents paths.
SurrealDB Integration
crates/surrealdb/*, crates/surrealdb/src/config.rs, crates/surrealdb/src/crypto.rs, crates/surrealdb/src/error.rs, crates/surrealdb/src/lib.rs, crates/surrealdb/src/models.rs, crates/surrealdb/src/schema.rs, crates/surrealdb/schema/...
New valymux-surrealdb crate with database facade, encryption/decryption helpers for provider credentials, config validation with pbkdf2 key derivation support, comprehensive models (User, VirtualApiKey, ProviderCredential, ModelCatalog, RequestLog), error types, and 8 SurQL schema files defining auth, API keys, models, providers, and request logging tables with permissions tied to authenticated users.
REST API & Authentication
src/rts/v1.rs, src/rts/extractors.rs, src/rts/mod.rs
New v1 REST router with endpoints for user signup/signin, profile updates, provider credential CRUD, virtual API key CRUD, model listing, and chat completion proxy at /v1/chat/completions. Added RequireAuth extractor for bearer-token authentication against the database.
Chat Proxy Service
src/svc/proxy/mod.rs, src/svc/proxy/openai.rs, src/svc/proxy/anthropic.rs, src/svc/proxy/types.rs, src/svc/mod.rs
New ProxyExecutor implementing canonical request normalization, provider adapter pattern, model capability validation, and provider-specific request/response translation. Concrete adapters for OpenAI and Anthropic with streaming support, usage extraction, error handling, and request logging. Canonical request type normalizes incoming JSON with provider-specific option extraction.
Configuration & Setup
Cargo.toml, src/sys/config.rs, src/sys/state.rs, src/sys/init.rs, src/sys/client.rs, src/sys/mod.rs, crates/core/src/error.rs
Extended Cargo.toml with 25+ new dependencies (SurrealDB, crypto, async utilities). Updated AppConfig with SurrealDB connection fields, custom Debug impl, and database_config() helper. Modified AppState to hold database and trait-bound HTTP client. Updated initialization to connect/bootstrap SurrealDB. Added HttpClient trait for testability. Added AppError::NotFound variant.
Environment & Build Configuration
.env.example, .cargo/config.toml, clippy.toml, rustfmt.toml, deny.toml, Dockerfile, .dockerignore
Added environment variable template with SurrealDB and encryption key placeholders. Configured Cargo with aliases, dev/test incremental compilation. Set Clippy MSRV to 1.85 with complexity limits. Configured rustfmt with 100-char line width and 2024 edition. Added cargo-deny license/advisory policy. Created multi-stage Dockerfile with dependency caching and non-root runtime.
GitHub & Contributing Docs
.github/CODEOWNERS, .github/ISSUE_TEMPLATE/bug_report.yml, .github/ISSUE_TEMPLATE/config.yml, .github/ISSUE_TEMPLATE/feature_request.yml, .github/PULL_REQUEST_TEMPLATE.md, CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md, AGENTS.md
Established governance with CODEOWNERS, issue templates for bugs/features, PR template with checklists, comprehensive contributing guide (setup, conventions, commit style), code of conduct referencing Contributor Covenant v2.1, security policy with private reporting via GitHub Security Advisories, and agent guidelines for repo-wide practices.
CI/CD Workflows
.github/workflows/ci.yml, .github/workflows/audit.yml, .github/workflows/release.yml
Added GitHub Actions for continuous format/clippy/test checks on push/PR, security audits on Cargo.lock changes and scheduled Monday runs, and multi-platform release builds (Linux musl/macOS/Windows) with artifact staging and GitHub release creation with changelog extraction.
Documentation & Changelog
CHANGELOG.md, README.md, skills-lock.json, src/lib.rs
Added changelog tracking new proxy/auth/encryption/telemetry features and ValyGate→ValyMux rename. Expanded README with quick-start, Docker setup, configuration table, OpenAI API compatibility details, project structure, and license section. Pinned Rust skill versions via skills-lock.json. Exported new svc module in crate root.

Sequence Diagram

sequenceDiagram
    participant Client
    participant API as V1 Router<br/>/v1/chat/completions
    participant DB as Database<br/>(SurrealDB)
    participant Proxy as ProxyExecutor
    participant Adapter as Provider Adapter<br/>(OpenAI/Anthropic)
    participant Upstream as Upstream<br/>Provider API

    Client->>API: POST with bearer token<br/>& chat request
    API->>API: extract_bearer_token()
    API->>DB: authenticate_user(token)
    DB-->>API: User
    API->>API: parse JSON to<br/>CanonicalChatRequest
    API->>Proxy: execute_chat_completion()
    Proxy->>DB: resolve_proxy_route()<br/>(key, model)
    DB-->>Proxy: ResolvedProxyRoute<br/>(user, key, model,<br/>provider_credential)
    Proxy->>Proxy: validate_model_capabilities()
    Proxy->>DB: decrypt_provider_api_key()
    DB-->>Proxy: decrypted_api_key
    Proxy->>Adapter: validate_request()
    Proxy->>Adapter: prepare_body()
    Adapter-->>Proxy: provider_json
    Proxy->>Adapter: apply_headers()
    Adapter-->>Proxy: RequestBuilder
    Proxy->>Upstream: execute HTTP request
    alt streaming
        Upstream-->>Proxy: Stream (SSE)
        Proxy->>Adapter: translate_stream_response()
        Adapter-->>Proxy: OpenAI SSE Body
        Proxy->>DB: log_request()<br/>(streaming=true)
        Proxy-->>Client: Response Body (SSE)
    else non-streaming
        Upstream-->>Proxy: JSON response
        Proxy->>Adapter: translate_json_response()
        Adapter-->>Proxy: OpenAI JSON
        Proxy->>Adapter: extract_usage()
        Proxy->>DB: log_request()<br/>(with usage)
        Proxy-->>Client: JSON response
    end
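The extract_bearer_token() step in the sequence above is plain header parsing before the database lookup. A hedged, std-only sketch of that step (the function name comes from the diagram; the exact semantics of the real extractor are an assumption):

```rust
// Sketch of the bearer-token parsing behind a RequireAuth-style extractor.
// The real extractor also authenticates the token against the database.
fn extract_bearer_token(header: &str) -> Option<&str> {
    // The real code may accept a case-insensitive "bearer" scheme; this
    // sketch handles only the canonical spelling.
    let token = header.strip_prefix("Bearer ")?.trim();
    if token.is_empty() { None } else { Some(token) }
}

fn main() {
    assert_eq!(extract_bearer_token("Bearer vk_abc123"), Some("vk_abc123"));
    assert_eq!(extract_bearer_token("Basic dXNlcg=="), None);
    assert_eq!(extract_bearer_token("Bearer "), None);
    println!("bearer parsing ok");
}
```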
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 33.58%, below the required 80.00% threshold. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2)
  • Title check — the PR title 'Add SurrealDB backend and v1 proxy API' is clearly and specifically related to the major changeset, capturing the main additions of database integration and proxy API functionality.
  • Description check — the PR description is well-structured and directly related to the changeset, covering SurrealDB integration, schema definitions, and Rust modules that align with the code changes.




- Add pbkdf2 crate and workspace entry; bump deps (rand, surrealdb, uuid, bytes, futures-util) and update Cargo.lock
- Parse surreal_encryption_key as 32-byte hex, base64, or pbkdf2$<iters>$<base64-salt>$<passphrase>; derive the key, redact config Debug output, and add unit tests
- Change crypto APIs to accept a fixed 32-byte key and store encryption_key on Database
- Make the schema IF NOT EXISTS check case-insensitive
- Add tracing/instrumentation and richer logging to rts v1 handlers
- Extend CanonicalChatRequest with message/penalty/stop/etc. fields and update the OpenAI/Anthropic adapters and proxy executor to honor provider options and passthrough keys

- Add surrealdb-types dependency and workspace entries
- Replace surrealdb::sql::Thing with surrealdb::types::RecordId
- Derive SurrealValue where needed and use ToSql for ID serialization
- Use surrealdb Token (token.access.into_insecure_token()) for auth
- Add AppError::NotFound and return 404 for missing resources
- Switch crypto RNG to SysRng with TryRng error handling
- Add upstream_status_code_or_warn to validate upstream status codes
- Restrict provider option keys for OpenAI/Anthropic and remove a
  redundant validation call
- Update Axum route params from :param to {param} and add a base64
  encryption key test

Return Result from generate_virtual_api_key and propagate errors to
callers (including tests). Add rust-version = "1.85" to the surrealdb
crate and remove the pbkdf2 workspace dependency (and its Cargo.lock
entry). Add a TODO in the Anthropic adapter to handle structured
content in the future.
Extract SeedModel and seed helpers out of seed_model_catalog and
reuse them. Add clippy allows for excessive bools and missing error
docs. Add /// # Errors docs to several APIs. Use a float-tolerance
check for temperature comparisons. Simplify SysRng error formatting,
change chunk/merge_provider_options signatures, and tighten upstream
status handling and adapter validation/translation behavior.
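The float-tolerance check for temperature comparisons mentioned above avoids exact equality on floating-point values. A minimal sketch (the epsilon constant is an assumption, not the project's actual value):

```rust
// Exact f64 equality is fragile (0.1 + 0.2 != 0.3); compare within a tolerance.
const TEMP_EPS: f64 = 1e-6; // illustrative epsilon, not the project's constant

fn temps_equal(a: f64, b: f64) -> bool {
    (a - b).abs() < TEMP_EPS
}

fn main() {
    // Fails under `==` due to binary floating-point rounding, passes here.
    assert!(temps_equal(0.1 + 0.2, 0.3));
    assert!(!temps_equal(0.7, 1.0));
    println!("tolerance check ok");
}
```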


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.gitignore:
- Line 5: The .gitignore entry ignores the entire .claude directory which
prevents tracking .claude/skills files; update the ignore rules so
.claude/skills and its contents are not ignored (e.g., remove the blanket
".claude" rule or replace it with an allowlist that keeps .claude/skills and its
subfiles like .claude/skills/** while still ignoring other unwanted files under
.claude) so the PR's shipped skills remain versioned.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 68a94b8d-3e9b-4ebd-86b8-64638ea03009

📥 Commits

Reviewing files that changed from the base of the PR and between b153c83 and 8f97fdd.

📒 Files selected for processing (1)
  • .gitignore
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test (macos-latest)
  • GitHub Check: Test (ubuntu-latest)

.gitignore contents under review:

.env
.agents
/skills
.claude

⚠️ Potential issue | 🟠 Major

.claude blanket ignore likely blocks versioning of required .claude/skills assets

Ignoring the entire .claude directory conflicts with the PR objective that ships files under .claude/skills. This can silently prevent future updates to those tracked skill files.

Suggested fix
 .agents
 /skills
-.claude
+.claude/*
+!.claude/skills/
+!.claude/skills/**
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (replace the blanket `.claude` entry):

.agents
/skills
.claude/*
!.claude/skills/
!.claude/skills/**
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore at line 5, The .gitignore entry ignores the entire .claude
directory which prevents tracking .claude/skills files; update the ignore rules
so .claude/skills and its contents are not ignored (e.g., remove the blanket
".claude" rule or replace it with an allowlist that keeps .claude/skills and its
subfiles like .claude/skills/** while still ignoring other unwanted files under
.claude) so the PR's shipped skills remain versioned.

- Simplify and tighten GitHub workflows (permissions, fewer jobs,
  consolidated format/clippy/test steps, release checksum and changelog
  improvements)
- Add DatabaseError::NotFound and propagate it to AppError::NotFound
  where appropriate
- Implement Debug for sensitive models to redact API keys and tokens
- Inject a shared reqwest::Client into AppState and use it for proxy
  requests
- Improve auth handling and observability: log auth failures, ensure
  accounts are enabled, and compute token fingerprints with SHA256
- Add sha2/hex deps, set rust-version, and bump Docker base image to
  rust:1.88
Remove several rustfmt options: imports_granularity, group_imports,
comment_width, wrap_comments, trailing_comma, trailing_semicolon,
blank_lines_upper_bound, and blank_lines_lower_bound. Keep edition,
max_width, and use_field_init_shorthand.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sys/init.rs (1)

40-46: ⚠️ Potential issue | 🟠 Major

Move the total timeout away from the shared client to prevent cutting off streaming responses.

The reqwest::ClientBuilder::timeout(...) on line 44 applies to the full request/response lifetime. Since this client is reused by the proxy layer for /v1/chat/completions streaming responses (which use upstream.bytes_stream()), any response exceeding http_timeout_secs will be abruptly cut off. Keep the connect_timeout on line 43, but apply the total timeout only to non-streaming request paths or use per-chunk read timeouts for streams to handle stalled connections without timing out legitimate long-lived responses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sys/init.rs` around lines 40 - 46, The shared reqwest Client built into
variable http_client currently sets a global request timeout via
Client::builder().timeout(...), which can abort long-lived streaming responses
(e.g., proxying /v1/chat/completions). Remove the global timeout call from
Client::builder() and keep only connect_timeout(...); then enforce total-request
timeouts only at call sites that perform non-streaming requests or implement
per-stream read timeouts / per-chunk timeouts for streaming proxies (use the
config fields http_timeout_secs for non-streaming callers or wrap individual
Requests with a timeout/futures::timeout where appropriate). Update references
to Client::builder(), .timeout(...), .connect_timeout(...), and the http_client
usage to reflect per-request timeout handling.
♻️ Duplicate comments (3)
Dockerfile (1)

6-6: ⚠️ Potential issue | 🔴 Critical

Bump the builder image to a SurrealDB-compatible Rust version.

The SurrealDB Rust SDK docs require Rust 1.89 or greater, but the deps stage is pinned to rust:1.88-slim-bookworm. Because builder inherits from deps, the container build can fail before your code is compiled. (surrealdb.com)

⬆️ Suggested patch
-FROM rust:1.88-slim-bookworm AS deps
+FROM rust:1.89-slim-bookworm AS deps
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` at line 6, The deps build stage is pinned to
rust:1.88-slim-bookworm which is incompatible with SurrealDB's requirement of
Rust >=1.89; update the FROM line in the Dockerfile for the deps stage (and any
inherited builder stage) to use rust:1.89-slim-bookworm or a newer compatible
Rust image so the builder inherits a SurrealDB-compatible toolchain.
.github/workflows/release.yml (1)

17-18: ⚠️ Potential issue | 🟠 Major

Drop repository write access from the build job.

This job only checks out code, compiles, and uploads artifacts. contents: write is broader than necessary; keep write scope on the release job only.

🔒 Suggested patch
     permissions:
-      contents: write
+      contents: read
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 17 - 18, The build job's
permissions block currently grants "contents: write" which is too broad; change
the permissions entry in the build job's permissions block from "contents:
write" to "contents: read" (or remove the write scope) so the job only needs
read access, and ensure that the release job still retains "contents: write"
where needed; update the "permissions" block that contains the "contents: write"
entry accordingly and verify no other jobs rely on repository write access.
src/svc/proxy/anthropic.rs (1)

418-433: ⚠️ Potential issue | 🟡 Minor

Require the documented shapes for stop_sequences, metadata, and thinking.

The current logic only bans arrays/objects for other keys; it never asserts that stop_sequences is an array or that metadata / thinking are objects. Scalars like "stop_sequences": "END" will pass validation and get sent upstream.

🩹 Suggested validation
         if !ANTHROPIC_ALLOWED_PROVIDER_KEYS.contains(&key.as_str()) {
             bail!("unsupported anthropic provider option `{key}`");
         }

+        match key.as_str() {
+            "stop_sequences" if !value.is_array() => {
+                bail!("providerOptions.anthropic.stop_sequences must be an array");
+            }
+            "metadata" | "thinking" if !value.is_object() => {
+                bail!("providerOptions.anthropic.{key} must be an object");
+            }
+            _ => {}
+        }
+
         if key != "stop_sequences" && value.is_array() {
             bail!("unsupported anthropic provider option `{key}`");
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/svc/proxy/anthropic.rs` around lines 418 - 433, The validation currently
checks only that most keys aren't arrays/objects but fails to enforce required
shapes for "stop_sequences" (must be an array) and "metadata"/"thinking" (must
be objects); update the loop over provider_options so that when key ==
"stop_sequences" you assert value.is_array() and bail with a clear message if
not, and when matches!(key.as_str(), "metadata" | "thinking") you assert
value.is_object() and bail if not; keep existing checks for reserved and allowed
keys (ANTHROPIC_RESERVED_PROVIDER_KEYS, ANTHROPIC_ALLOWED_PROVIDER_KEYS) and
preserve the existing error style/strings used by the function to maintain
consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/audit.yml:
- Around line 22-25: Add a CI step that installs and runs cargo-deny to enforce
deny.toml: install cargo-deny (e.g., via cargo install cargo-deny) and run cargo
deny check --config deny.toml (or just cargo deny check if config auto-detected)
as a new step in the audit workflow after the cargo audit step so the deny.toml
policy (bans, licenses, provenance) is validated in CI.

In @.github/workflows/ci.yml:
- Line 40: CI runs invoke cargo commands (e.g., the step using "cargo clippy
--all-targets --all-features -- -D warnings") without --locked, which lets a
stale Cargo.lock slip through; update all cargo invocations in the workflow (at
least the "cargo clippy ..." step and any other "cargo build"/"cargo test"
steps) to include the --locked flag so the CI will fail if Cargo.lock is out of
sync with Cargo.toml.

In @.github/workflows/release.yml:
- Around line 70-76: The release workflow's build steps ("Build (native)" and
"Build (cross)") call `cargo build --release --target ${ matrix.target }` and
`cross build --release --target ${ matrix.target }` without the `--locked` flag;
update those commands to include `--locked` (i.e., `cargo build --locked
--release --target ...` and `cross build --locked --release --target ...`) so
the builds honor the committed Cargo.lock and ensure reproducible release
artifacts.
- Around line 122-129: Update the changelog extraction step (step id changelog)
to match the release header against the actual tag by first deriving VERSION
from github.ref_name (strip leading "v") into a shell variable (e.g.
VERSION=$(echo "${GITHUB_REF_NAME#v}")), then run awk that looks specifically
for "^## [VERSION]" (use the shell-expanded VERSION in the awk pattern) to
extract the section for that version, writing to release_notes.md; keep the
existing fallback that writes "No release notes found in CHANGELOG.md" when
release_notes.md is empty.
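The changelog-extraction step described above is a shell step in the workflow. A hedged sketch under stated assumptions — the CHANGELOG.md layout, the v-prefixed tag, and the `## <version>` header convention are taken from the review comment, not verified against the repo:

```shell
# Simulate the Actions-provided ref name locally; in CI this comes from github.ref_name.
GITHUB_REF_NAME="v1.2.0"
VERSION="${GITHUB_REF_NAME#v}"   # strip the leading "v" -> 1.2.0

# Illustrative changelog matching the assumed "## <version>" header convention.
cat > CHANGELOG.md <<'EOF'
## 1.2.0
- Add SurrealDB backend
## 1.1.0
- Older entry
EOF

# Print only the section whose header matches this tag's version.
awk -v ver="$VERSION" '
  $0 ~ "^## \\[?" ver { found = 1; next }  # header for our version (optionally bracketed)
  /^## / && found     { exit }             # next version header ends the section
  found               { print }
' CHANGELOG.md > release_notes.md

# Keep the existing fallback when nothing matched.
[ -s release_notes.md ] || echo "No release notes found in CHANGELOG.md" > release_notes.md
cat release_notes.md
```

Deriving VERSION once in the shell and passing it to awk via `-v` keeps the pattern tied to the actual tag instead of grabbing the first changelog section.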

In `@Cargo.toml`:
- Line 5: Update the workspace Rust minimum supported version to 1.89: change
the rust-version entry named rust-version in Cargo.toml from "1.85" to "1.89";
update the clippy configuration entry that pins the toolchain (the rust
toolchain/version entry in clippy.toml) to 1.89 as well; and update the
Dockerfile base image line that selects the Rust toolchain (the FROM or rust
image tag) to use a Rust 1.89 image so the container matches the workspace
requirement.

In `@crates/surrealdb/src/lib.rs`:
- Around line 144-146: Replace misclassified DatabaseError::InvalidConfig usages
for request-time auth/permission failures with proper semantic variants and
return those from the auth/allowlist checks: add new DatabaseError variants
(e.g., Unauthorized, Forbidden, BadRequest) to the DatabaseError enum and use
Unauthorized for invalid API keys/credentials, Forbidden for disabled accounts
or model-allowlist denials, and BadRequest for malformed request conditions;
update the checks that currently do `if !user.enabled { return
Err(DatabaseError::InvalidConfig(...)) }` and the other occurrences mentioned
(around the 160-164, 483-493, 506-509 ranges) to return the appropriate new
variant, and then map these new variants to HTTP statuses in the HTTP/Axum error
mapping layer instead of relying on string matching.

In `@src/rts/v1.rs`:
- Around line 171-174: The handler currently maps all errors from
state.database.signup_user and signin_user to internal_error, causing expected
auth failures to become 500s; change the map_err closures on
state.database.signup_user(...) and state.database.signin_user(...) to match the
database error variants and convert them to the appropriate
crates::core::AppError (e.g., BadRequest or Conflict for signup constraint
violations, Unauthorized or Forbidden for signin bad credentials/disabled
accounts) instead of always calling internal_error, ensuring the returned
AppError is propagated from the handler so Axum returns the correct 4xx
response.

In `@src/svc/proxy/anthropic.rs`:
- Around line 184-193: The code currently decodes each SSE chunk with
String::from_utf8_lossy as it arrives from upstream_stream in the loop, which
can corrupt multibyte UTF‑8 fragments; instead, read and append raw bytes (e.g.,
push bytes into a Vec<u8> or BytesMut) rather than converting each chunk to
String, search for the SSE/event delimiter in the raw buffer, and only call
String::from_utf8_lossy or from_utf8 on the complete event slice before passing
it to handle_event(); update the loop that consumes upstream_stream and the
buffer variable handling so decoding happens after a full event boundary is
assembled.
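The buffering fix described above can be sketched with std only: accumulate raw bytes and decode UTF-8 only once a full SSE event (terminated by a blank line) is present, so a network chunk boundary can no longer split a multibyte code point mid-character. Names here are illustrative, not the project's actual code:

```rust
// Drain complete SSE events (delimited by "\n\n") from a raw byte buffer,
// decoding to UTF-8 only at event boundaries.
fn drain_events(buf: &mut Vec<u8>) -> Vec<String> {
    let mut events = Vec::new();
    while let Some(pos) = buf.windows(2).position(|w| w == &b"\n\n"[..]) {
        let event: Vec<u8> = buf.drain(..pos + 2).collect();
        // Decoding here is safe: an event boundary cannot fall inside a
        // multibyte sequence the way an arbitrary network chunk can.
        events.push(String::from_utf8_lossy(&event).trim().to_string());
    }
    events
}

fn main() {
    let mut buf = Vec::new();
    // "é" is 0xC3 0xA9; simulate it being split across two network chunks.
    buf.extend_from_slice(b"data: caf\xC3");
    assert!(drain_events(&mut buf).is_empty()); // incomplete event: decode nothing
    buf.extend_from_slice(b"\xA9\n\ndata: [DONE]\n\n");
    let events = drain_events(&mut buf);
    assert_eq!(events, vec!["data: café", "data: [DONE]"]);
    println!("sse buffering ok");
}
```

Per-chunk `from_utf8_lossy` would have replaced the split 0xC3 byte with U+FFFD; deferring the decode preserves it.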

In `@src/svc/proxy/mod.rs`:
- Around line 92-123: The SSE translation currently runs whenever
canonical_request.stream is true, which lets translate_stream_response try to
parse upstream error bodies as SSE; fix by gating the SSE branch: after checking
canonical_request.stream and adapter.stream_headers(), also verify the upstream
returned a successful status (e.g., status_code.is_success()) and that the
headers/content-type indicate "text/event-stream" (via adapter.stream_headers()
or the returned headers) before calling adapter.translate_stream_response; if
the upstream status is non-success or content-type is not event-stream, skip
translate_stream_response and instead follow the normal JSON/error path
(persist_request_log_or_warn with the error info and return the
upstream_status_code_or_warn(...) response) rather than collapsing the error
into an empty [DONE] stream.
- Around line 157-205: validate_model_capabilities currently doesn't enforce
system/thinking flags; update it to reject or enforce requests based on
route.model.supports_system_messages, route.model.supports_thinking, and
route.model.thinking_required: inspect the CanonicalChatRequest for any
system-role messages (e.g., scan request.messages for role == "system") and if
any are present but route.model.supports_system_messages is false, anyhow::bail;
detect use of "thinking" (check request for a dedicated thinking flag or
provider options such as request.provider_options.anthropic.thinking) and if
thinking is present but route.model.supports_thinking is false, anyhow::bail; if
route.model.thinking_required is true then require that thinking is present on
the request and otherwise bail; keep these checks alongside the existing
temperature/stream/tool validations in validate_model_capabilities so the
catalog properly gates system/thinking combinations.
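The gating logic the comment asks for reduces to three boolean checks against the model catalog. A hypothetical sketch — the struct, field names, and function signature below are modeled on the comment, not taken from the actual code:

```rust
// Hypothetical shape of the system/thinking capability gating described above.
struct ModelCaps {
    supports_system_messages: bool,
    supports_thinking: bool,
    thinking_required: bool,
}

fn validate(caps: &ModelCaps, has_system: bool, has_thinking: bool) -> Result<(), String> {
    if has_system && !caps.supports_system_messages {
        return Err("model does not support system messages".into());
    }
    if has_thinking && !caps.supports_thinking {
        return Err("model does not support thinking".into());
    }
    if caps.thinking_required && !has_thinking {
        return Err("model requires thinking to be enabled".into());
    }
    Ok(())
}

fn main() {
    let caps = ModelCaps {
        supports_system_messages: false,
        supports_thinking: true,
        thinking_required: false,
    };
    assert!(validate(&caps, true, false).is_err()); // system message rejected
    assert!(validate(&caps, false, true).is_ok());  // thinking allowed
    println!("capability gating ok");
}
```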

---

Outside diff comments:
In `@src/sys/init.rs`:
- Around line 40-46: The shared reqwest Client built into variable http_client
currently sets a global request timeout via Client::builder().timeout(...),
which can abort long-lived streaming responses (e.g., proxying
/v1/chat/completions). Remove the global timeout call from Client::builder() and
keep only connect_timeout(...); then enforce total-request timeouts only at call
sites that perform non-streaming requests or implement per-stream read timeouts
/ per-chunk timeouts for streaming proxies (use the config fields
http_timeout_secs for non-streaming callers or wrap individual Requests with a
timeout/futures::timeout where appropriate). Update references to
Client::builder(), .timeout(...), .connect_timeout(...), and the http_client
usage to reflect per-request timeout handling.

---

Duplicate comments:
In @.github/workflows/release.yml:
- Around line 17-18: The build job's permissions block currently grants
"contents: write" which is too broad; change the permissions entry in the build
job's permissions block from "contents: write" to "contents: read" (or remove
the write scope) so the job only needs read access, and ensure that the release
job still retains "contents: write" where needed; update the "permissions" block
that contains the "contents: write" entry accordingly and verify no other jobs
rely on repository write access.

In `@Dockerfile`:
- Line 6: The deps build stage is pinned to rust:1.88-slim-bookworm which is
incompatible with SurrealDB's requirement of Rust >=1.89; update the FROM line
in the Dockerfile for the deps stage (and any inherited builder stage) to use
rust:1.89-slim-bookworm or a newer compatible Rust image so the builder inherits
a SurrealDB-compatible toolchain.

In `@src/svc/proxy/anthropic.rs`:
- Around line 418-433: The validation currently checks only that most keys
aren't arrays/objects, but it fails to enforce the required shapes for
"stop_sequences" (must be an array) and "metadata"/"thinking" (must be
objects). Update the loop over provider_options so that when key ==
"stop_sequences" you assert value.is_array() and bail with a clear message if
not, and when matches!(key.as_str(), "metadata" | "thinking") you assert
value.is_object() and bail if not. Keep the existing checks for reserved and
allowed keys (ANTHROPIC_RESERVED_PROVIDER_KEYS, ANTHROPIC_ALLOWED_PROVIDER_KEYS)
and preserve the function's existing error style/strings for consistency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 044e1009-dafe-40e9-ba47-3f4c2601ec14

📥 Commits

Reviewing files that changed from the base of the PR and between 8f97fdd and a09bade.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
  • .github/workflows/audit.yml
  • .github/workflows/ci.yml
  • .github/workflows/release.yml
  • CLAUDE.md
  • Cargo.toml
  • Dockerfile
  • clippy.toml
  • crates/surrealdb/src/error.rs
  • crates/surrealdb/src/lib.rs
  • crates/surrealdb/src/models.rs
  • deny.toml
  • src/rts/extractors.rs
  • src/rts/v1.rs
  • src/svc/proxy/anthropic.rs
  • src/svc/proxy/mod.rs
  • src/sys/init.rs
  • src/sys/state.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test
  • GitHub Check: Clippy
  • GitHub Check: cargo-audit
🧰 Additional context used
📓 Path-based instructions (3)
src/sys/**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

Use anyhow for internal and unexpected errors

Files:

  • src/sys/init.rs
  • src/sys/state.rs
src/**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.rs: Use the tracing macros (info!, warn!, error!, debug!) throughout the codebase for observability
Use Tokio async runtime for concurrent request handling in Axum

Files:

  • src/sys/init.rs
  • src/rts/extractors.rs
  • src/sys/state.rs
  • src/rts/v1.rs
  • src/svc/proxy/anthropic.rs
  • src/svc/proxy/mod.rs
src/rts/**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

src/rts/**/*.rs: Use AppError from crates/core for user-facing errors in Axum request handlers
Use Serde for serialization and deserialization of request/response data

Files:

  • src/rts/extractors.rs
  • src/rts/v1.rs
🪛 Checkov (3.2.510)
Dockerfile

[low] 1-84: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)

🪛 Hadolint (2.14.0)
Dockerfile

[warning] 11-11: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)


[info] 50-50: Multiple consecutive RUN instructions. Consider consolidation.

(DL3059)


[warning] 62-62: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)

🪛 Trivy (0.69.3)
Dockerfile

[info] 1-1: No HEALTHCHECK defined

Add HEALTHCHECK instruction in your Dockerfile

Rule: DS-0026

Learn more

(IaC/Dockerfile)

🔇 Additional comments (1)
CLAUDE.md (1)

1-1: LGTM! Branding update is consistent.

The rename from "ValyGate" to "ValyMux" has been applied consistently across the title, project description, and environment configuration section. The documentation remains technically accurate and grammatically correct.

Also applies to: 3-3, 36-36

Comment on lines +22 to +25
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- run: cargo install cargo-audit --locked
- run: cargo audit

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n 'cargo[- ]deny|deny\.toml' .github/workflows

Repository: CLoaKY233/Valymux

Length of output: 43


Add cargo deny check to enforce the new deny.toml policy in CI.

The audit workflow only runs cargo audit. You've added deny.toml for bans, licenses, and source-provenance checks, but nothing in your workflows executes cargo deny check. Add a step to enforce it, otherwise the policy can drift without CI catching it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/audit.yml around lines 22 - 25, Add a CI step that
installs and runs cargo-deny to enforce deny.toml: install cargo-deny (e.g., via
cargo install cargo-deny) and run cargo deny check --config deny.toml (or just
cargo deny check if config auto-detected) as a new step in the audit workflow
after the cargo audit step so the deny.toml policy (bans, licenses, provenance)
is validated in CI.
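As a sketch, the extra step could look like this (exact placement and step style in audit.yml are assumptions, mirroring the commands named in the prompt above):

```yaml
      # Hypothetical addition after the cargo audit step in audit.yml
      - run: cargo install cargo-deny --locked
      - run: cargo deny check
```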

with:
components: clippy
- uses: Swatinem/rust-cache@v2
- run: cargo clippy --all-targets --all-features -- -D warnings

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find and examine the CI workflow file
find . -name "ci.yml" -path "*/.github/workflows/*" | head -5

Repository: CLoaKY233/Valymux

Length of output: 87


🏁 Script executed:

# Find Dockerfile
fd -t f "Dockerfile" | head -5

Repository: CLoaKY233/Valymux

Length of output: 71


🏁 Script executed:

# Examine the CI workflow file, focusing on lines 35-55 to see the context around lines 40 and 49-50
sed -n '35,55p' .github/workflows/ci.yml | cat -n

Repository: CLoaKY233/Valymux

Length of output: 670


🏁 Script executed:

# Examine the Dockerfile to verify if it uses --locked
cat Dockerfile | cat -n

Repository: CLoaKY233/Valymux

Length of output: 3445


🏁 Script executed:

# Get the full CI workflow to see all the cargo commands
cat -n .github/workflows/ci.yml

Repository: CLoaKY233/Valymux

Length of output: 1460


Use --locked in CI to catch stale Cargo.lock updates.

The Dockerfile builds with --locked (lines 34, 50), but the CI cargo commands do not. A PR can pass here and fail later in the Docker/release path when the committed lockfile is stale.

🔒 Suggested patch
-      - run: cargo clippy --all-targets --all-features -- -D warnings
+      - run: cargo clippy --locked --all-targets --all-features -- -D warnings
-      - run: cargo build --all-targets --all-features
-      - run: cargo test --all-features
+      - run: cargo build --locked --all-targets --all-features
+      - run: cargo test --locked --all-features
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml at line 40, CI runs invoke cargo commands (e.g.,
the step using "cargo clippy --all-targets --all-features -- -D warnings")
without --locked, which lets a stale Cargo.lock slip through; update all cargo
invocations in the workflow (at least the "cargo clippy ..." step and any other
"cargo build"/"cargo test" steps) to include the --locked flag so the CI will
fail if Cargo.lock is out of sync with Cargo.toml.

Comment on lines +70 to +76
- name: Build (native)
if: "!matrix.cross"
run: cargo build --release --target ${{ matrix.target }}

- name: Build (cross)
if: matrix.cross
run: cross build --release --target ${{ matrix.target }}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the release.yml file
find . -name "release.yml" -path "*/.github/workflows/*" | head -5

Repository: CLoaKY233/Valymux

Length of output: 92


🏁 Script executed:

# Read the release.yml file to see the current state
cat .github/workflows/release.yml

Repository: CLoaKY233/Valymux

Length of output: 4058


🏁 Script executed:

# Check if Cargo.lock is committed to the repository
git ls-files | grep -E "Cargo\.lock$"

Repository: CLoaKY233/Valymux

Length of output: 71


Add --locked flag to release build commands for reproducibility.

The cargo build and cross build commands omit --locked, which means release artifacts could use dependency versions different from the committed Cargo.lock. This breaks reproducibility for production binaries. The workflow already uses --locked when installing cross (line 63), so this should be consistent across all build steps.

🔒 Suggested patch
-        run: cargo build --release --target ${{ matrix.target }}
+        run: cargo build --locked --release --target ${{ matrix.target }}
-        run: cross build --release --target ${{ matrix.target }}
+        run: cross build --locked --release --target ${{ matrix.target }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 70 - 76, The release workflow's
build steps ("Build (native)" and "Build (cross)") call `cargo build --release
--target ${{ matrix.target }}` and `cross build --release --target ${{
matrix.target }}` without the `--locked` flag; update those commands to include
`--locked` (i.e., `cargo build --locked --release --target ...` and `cross build
--locked --release --target ...`) so the builds honor the committed Cargo.lock
and ensure reproducible release artifacts.

Comment on lines +122 to +129
- name: Extract release notes from CHANGELOG
id: changelog
run: |
# Extract lines between the first and second versioned section headers
awk '/^## \[/{if(found){exit}; found=1; next} found{print}' CHANGELOG.md > release_notes.md
if [ ! -s release_notes.md ]; then
echo "No release notes found in CHANGELOG.md" > release_notes.md
fi

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -name "CHANGELOG*" -type f 2>/dev/null | head -20

Repository: CLoaKY233/Valymux

Length of output: 43


🏁 Script executed:

# First, let's explore the repo structure
git ls-files | grep -i changelog

Repository: CLoaKY233/Valymux

Length of output: 73


🏁 Script executed:

# Also check the release.yml file to understand the context
head -150 .github/workflows/release.yml | tail -50

Repository: CLoaKY233/Valymux

Length of output: 1637


🏁 Script executed:

head -50 CHANGELOG.md

Repository: CLoaKY233/Valymux

Length of output: 1726


🏁 Script executed:

grep "^## \[" CHANGELOG.md | head -10

Repository: CLoaKY233/Valymux

Length of output: 76


🏁 Script executed:

# Let's verify the awk logic by simulating it with the actual CHANGELOG
cat CHANGELOG.md | awk '/^## \[/{if(found){exit}; found=1; next} found{print}' | head -20

Repository: CLoaKY233/Valymux

Length of output: 1027


🏁 Script executed:

# Check the workflow trigger conditions to understand when this runs
head -20 .github/workflows/release.yml

Repository: CLoaKY233/Valymux

Length of output: 381


Extract release notes for the specific tag version, not the first section.

The current awk command extracts the first ## [...] section regardless of which tag triggered the release. While CHANGELOG.md correctly starts with ## [Unreleased], the workflow relies on the developer having already moved the Unreleased entries into a versioned section before tagging. That assumption is fragile: if an old commit is tagged or the CHANGELOG update is forgotten, unreleased notes will be published for a release.

The extraction must explicitly match github.ref_name (or the version after stripping the leading v) against the CHANGELOG section header. The suggested patch addresses this by using awk with an explicit version pattern match.

📝 Suggested patch
       - name: Extract release notes from CHANGELOG
         id: changelog
         run: |
-          # Extract lines between the first and second versioned section headers
-          awk '/^## \[/{if(found){exit}; found=1; next} found{print}' CHANGELOG.md > release_notes.md
+          version="${GITHUB_REF_NAME#v}"
+          awk -v tag="$GITHUB_REF_NAME" -v version="$version" '
+            $0 ~ "^## \\[(" tag "|" version ")\\]" { found=1; next }
+            found && /^## \[/ { exit }
+            found { print }
+          ' CHANGELOG.md > release_notes.md
           if [ ! -s release_notes.md ]; then
-            echo "No release notes found in CHANGELOG.md" > release_notes.md
+            echo "No release notes found for $GITHUB_REF_NAME in CHANGELOG.md" > release_notes.md
           fi
           echo "Release notes extracted."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 122 - 129, Update the changelog
extraction step (step id changelog) to match the release header against the
actual tag: derive VERSION from github.ref_name by stripping the leading "v"
(e.g. VERSION="${GITHUB_REF_NAME#v}"), then run awk with a pattern that matches
"^## [VERSION]" specifically (expanding the shell variable into the awk
pattern) to extract that version's section into release_notes.md; keep the
existing fallback that writes "No release notes found in CHANGELOG.md" when
release_notes.md is empty.

name = "valymux"
version = "0.0.1"
edition = "2024"
rust-version = "1.85"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, find and examine the Cargo.toml file
fd Cargo.toml -type f | head -5

Repository: CLoaKY233/Valymux

Length of output: 231


🏁 Script executed:

# Read the Cargo.toml to see dependencies
cat -n Cargo.toml | head -80

Repository: CLoaKY233/Valymux

Length of output: 2734


🏁 Script executed:

# Search for clippy.toml and Dockerfile references
find . -name "clippy.toml" -o -name "Dockerfile*" | head -10

Repository: CLoaKY233/Valymux

Length of output: 87


🏁 Script executed:

cat -n clippy.toml

Repository: CLoaKY233/Valymux

Length of output: 444


🏁 Script executed:

cat -n Dockerfile

Repository: CLoaKY233/Valymux

Length of output: 3445


🌐 Web query:

SurrealDB Rust SDK 3.0.4 minimum supported Rust version MSRV

💡 Result:

The minimum supported Rust version (MSRV) for SurrealDB Rust SDK 3.0.4 is Rust 1.89 or greater.

Citations:


Raise the workspace MSRV to 1.89 to match SurrealDB 3.0.4 requirements.

The workspace declares rust-version = "1.85", but the added dependency surrealdb = "3.0.4" requires Rust 1.89 or greater. This mismatch means the manifest advertises a toolchain floor that cannot satisfy the dependency graph. Update Cargo.toml, clippy.toml, and the Dockerfile base image to align on Rust 1.89.

⬆️ Required changes across configuration files

Cargo.toml (line 5):

-rust-version = "1.85"
+rust-version = "1.89"

clippy.toml (line 4):

-msrv = "1.85"
+msrv = "1.89"

Dockerfile (line 6):

-FROM rust:1.88-slim-bookworm AS deps
+FROM rust:1.89-slim-bookworm AS deps
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Cargo.toml` at line 5, Update the workspace minimum supported Rust version
to 1.89: change the rust-version entry in Cargo.toml from "1.85" to "1.89";
update the msrv entry in clippy.toml to "1.89" as well; and update the
Dockerfile FROM line that selects the Rust base image to a Rust 1.89 image so
the container matches the workspace requirement.

Comment on lines +144 to +146
if !user.enabled {
return Err(DatabaseError::InvalidConfig("account is disabled".into()));
}

⚠️ Potential issue | 🟠 Major

Give request-time auth failures their own DatabaseError variants.

Disabled accounts, invalid virtual API keys, and model-allowlist failures are normal caller outcomes, not configuration bugs. Returning InvalidConfig here is what forces the Axum layers into brittle string matching or misclassified 401/500 responses. Add structured variants like Unauthorized, Forbidden, and BadRequest here, then map them once in the HTTP layer.

Also applies to: 160-164, 483-493, 506-509

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/surrealdb/src/lib.rs` around lines 144 - 146, Replace misclassified
DatabaseError::InvalidConfig usages for request-time auth/permission failures
with proper semantic variants and return those from the auth/allowlist checks:
add new DatabaseError variants (e.g., Unauthorized, Forbidden, BadRequest) to
the DatabaseError enum and use Unauthorized for invalid API keys/credentials,
Forbidden for disabled accounts or model-allowlist denials, and BadRequest for
malformed request conditions; update the checks that currently do `if
!user.enabled { return Err(DatabaseError::InvalidConfig(...)) }` and the other
occurrences mentioned (around the 160-164, 483-493, 506-509 ranges) to return
the appropriate new variant, and then map these new variants to HTTP statuses in
the HTTP/Axum error mapping layer instead of relying on string matching.
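A minimal sketch of what those variants and the disabled-account check could look like; everything besides InvalidConfig is an assumption about the real enum in error.rs, and the derive/Display plumbing is simplified:

```rust
use std::fmt;

// Hypothetical shape of the extended enum; the real crate may use thiserror.
#[derive(Debug)]
pub enum DatabaseError {
    InvalidConfig(String), // genuine configuration bugs only
    Unauthorized(String),  // invalid API keys or credentials -> 401
    Forbidden(String),     // disabled accounts, allowlist denials -> 403
    BadRequest(String),    // malformed request conditions -> 400
}

impl fmt::Display for DatabaseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::InvalidConfig(msg) => write!(f, "invalid config: {msg}"),
            Self::Unauthorized(msg) => write!(f, "unauthorized: {msg}"),
            Self::Forbidden(msg) => write!(f, "forbidden: {msg}"),
            Self::BadRequest(msg) => write!(f, "bad request: {msg}"),
        }
    }
}

// The disabled-account check then stops masquerading as a config error:
pub fn check_enabled(enabled: bool) -> Result<(), DatabaseError> {
    if !enabled {
        return Err(DatabaseError::Forbidden("account is disabled".into()));
    }
    Ok(())
}
```

With structured variants like these, the Axum layer can map each variant to a status code once instead of string-matching error messages.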

Comment on lines +171 to +174
let session = state.database.signup_user(input).await.map_err(|error| {
error!(error = %error, "signup failed");
internal_error(error)
})?;

⚠️ Potential issue | 🟠 Major

Don't surface expected auth failures as 500s.

signup_user() and signin_user() both have normal user-facing failure modes (constraint violations, bad credentials, disabled accounts), but both handlers currently collapse every database error into AppError::Internal. That turns common auth failures into 500 responses instead of BadRequest/Conflict or Unauthorized/Forbidden.

As per coding guidelines, src/rts/**/*.rs: Use AppError from crates/core for user-facing errors in Axum request handlers.

Also applies to: 187-190

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rts/v1.rs` around lines 171 - 174, The handler currently maps all errors
from state.database.signup_user and signin_user to internal_error, causing
expected auth failures to become 500s; change the map_err closures on
state.database.signup_user(...) and state.database.signin_user(...) to match the
database error variants and convert them to the appropriate
crates::core::AppError (e.g., BadRequest or Conflict for signup constraint
violations, Unauthorized or Forbidden for signin bad credentials/disabled
accounts) instead of always calling internal_error, ensuring the returned
AppError is propagated from the handler so Axum returns the correct 4xx
response.
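One way to sketch that mapping; both enums here are hypothetical stand-ins, since the real DatabaseError and AppError definitions live in the workspace crates:

```rust
// Assumed variant names for illustration only.
#[derive(Debug)]
pub enum DatabaseError {
    Unauthorized(String),
    Forbidden(String),
    Conflict(String),
    Internal(String),
}

#[derive(Debug, PartialEq)]
pub enum AppError {
    Unauthorized(String), // 401
    Forbidden(String),    // 403
    Conflict(String),     // 409
    Internal(String),     // 500
}

// Expected auth outcomes map to 4xx; only genuinely unexpected
// failures fall through to a 500.
pub fn map_signin_error(error: DatabaseError) -> AppError {
    match error {
        DatabaseError::Unauthorized(msg) => AppError::Unauthorized(msg),
        DatabaseError::Forbidden(msg) => AppError::Forbidden(msg),
        DatabaseError::Conflict(msg) => AppError::Conflict(msg),
        DatabaseError::Internal(msg) => AppError::Internal(msg),
    }
}
```

The map_err closures in the handlers would then call this mapper instead of unconditionally calling internal_error.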

Comment on lines +184 to +193
while let Some(chunk) = upstream_stream.next().await {
let chunk = match chunk {
Ok(chunk) => chunk,
Err(error) => {
yield Err(io::Error::other(error));
return;
}
};
buffer.push_str(&String::from_utf8_lossy(&chunk));


⚠️ Potential issue | 🟠 Major

Don't decode each SSE chunk with from_utf8_lossy.

bytes_stream() boundaries are arbitrary. Decoding each chunk independently will corrupt any multibyte UTF-8 sequence split across chunks, so non-ASCII model output can be mangled before handle_event() sees it. Buffer raw bytes and decode only after a full event delimiter has been assembled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/svc/proxy/anthropic.rs` around lines 184 - 193, The code currently
decodes each SSE chunk with String::from_utf8_lossy as it arrives from
upstream_stream in the loop, which can corrupt multibyte UTF‑8 fragments;
instead, read and append raw bytes (e.g., push bytes into a Vec<u8> or BytesMut)
rather than converting each chunk to String, search for the SSE/event delimiter
in the raw buffer, and only call String::from_utf8_lossy or from_utf8 on the
complete event slice before passing it to handle_event(); update the loop that
consumes upstream_stream and the buffer variable handling so decoding happens
after a full event boundary is assembled.
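The byte-buffering approach described above can be sketched in isolation. This assumes the standard "\n\n" SSE event separator; in the real loop each Ok(chunk) would be appended with buffer.extend_from_slice(&chunk) and every returned event passed to handle_event():

```rust
/// Split off every complete SSE event (terminated by "\n\n") from `buffer`.
/// Decoding happens only on whole events, so a multibyte UTF-8 character
/// split across chunk boundaries is never mangled.
pub fn drain_events(buffer: &mut Vec<u8>) -> Vec<String> {
    let mut events = Vec::new();
    while let Some(pos) = buffer.windows(2).position(|window| window == b"\n\n") {
        // Remove the event bytes (including the delimiter) from the front.
        let event: Vec<u8> = buffer.drain(..pos + 2).collect();
        events.push(String::from_utf8_lossy(&event).into_owned());
    }
    events
}
```

Any trailing partial event stays in the buffer until the next chunk arrives, which is exactly what the per-chunk from_utf8_lossy version gets wrong.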

Comment on lines +92 to +123
if canonical_request.stream {
if let Some(headers) = adapter.stream_headers() {
let stream_body = adapter
.translate_stream_response(upstream, &canonical_request, &route, &request_id)
.map_err(internal_error)?;
persist_request_log_or_warn(
state.clone(),
build_request_log(RequestLogDraft {
request_id: &request_id,
route: &route,
status_code,
latency_ms,
request_url: adapter.target_url(),
stream: true,
error_message: None,
usage: UsageSummary {
prompt_tokens: None,
completion_tokens: None,
},
}),
)
.await;

let response_status =
upstream_status_code_or_warn(status_code, adapter.target_url());
return Ok((response_status, headers, stream_body).into_response());
}

return Err(AppError::BadRequest(format!(
"{} streaming translation is not implemented yet",
adapter.name()
)));

⚠️ Potential issue | 🟠 Major

Don't run streamed requests through SSE translation after upstream failure.

This branch only checks canonical_request.stream. If the upstream responds with a 4xx/5xx JSON error body, translate_stream_response() will still try to parse it as SSE and can collapse the failure into an empty [DONE] stream. Gate SSE translation on success / text/event-stream, otherwise return the normal JSON error path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/svc/proxy/mod.rs` around lines 92 - 123, The SSE translation currently
runs whenever canonical_request.stream is true, which lets
translate_stream_response try to parse upstream error bodies as SSE; fix by
gating the SSE branch: after checking canonical_request.stream and
adapter.stream_headers(), also verify the upstream returned a successful status
(e.g., status_code.is_success()) and that the headers/content-type indicate
"text/event-stream" (via adapter.stream_headers() or the returned headers)
before calling adapter.translate_stream_response; if the upstream status is
non-success or content-type is not event-stream, skip translate_stream_response
and instead follow the normal JSON/error path (persist_request_log_or_warn with
the error info and return the upstream_status_code_or_warn(...) response) rather
than collapsing the error into an empty [DONE] stream.
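The gate can be reduced to a small predicate. This is a hypothetical free function; in the real code the inputs would come from canonical_request.stream, the upstream status, and the upstream Content-Type header:

```rust
/// Only translate as SSE when the caller asked for a stream, the upstream
/// succeeded, and the body actually claims to be an event stream.
pub fn should_translate_as_sse(
    stream_requested: bool,
    status: u16,
    content_type: Option<&str>,
) -> bool {
    stream_requested
        && (200..300).contains(&status)
        && content_type
            .map(|ct| ct.to_ascii_lowercase().starts_with("text/event-stream"))
            .unwrap_or(false)
}
```

When the predicate is false, the handler would fall through to the existing JSON error path and log the upstream failure instead.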

Comment on lines +157 to +205
fn validate_model_capabilities(
request: &CanonicalChatRequest,
route: &ResolvedProxyRoute,
) -> anyhow::Result<Option<f64>> {
const FLOAT_TOLERANCE: f64 = f64::EPSILON;

if request.stream && !route.model.supports_streaming {
anyhow::bail!("model does not support streaming");
}

if request.temperature.is_some() && !route.model.supports_temperature {
anyhow::bail!("model does not support temperature");
}

let enforced_temperature = if let Some(fixed) = route.model.temperature_fixed_to {
match request.temperature {
Some(requested) if (requested - fixed).abs() > FLOAT_TOLERANCE => {
anyhow::bail!("temperature must be {fixed} for this model");
}
_ => Some(fixed),
}
} else {
request.temperature
};

if let Some(temperature) = enforced_temperature {
if let Some(minimum) = route.model.temperature_min
&& temperature < minimum
{
anyhow::bail!("temperature must be at least {minimum} for this model");
}

if let Some(maximum) = route.model.temperature_max
&& temperature > maximum
{
anyhow::bail!("temperature must be at most {maximum} for this model");
}
}

if request.top_p.is_some() && !route.model.supports_top_p {
anyhow::bail!("model does not support top_p");
}

if request.tools.is_some() && !route.model.supports_tools {
anyhow::bail!("model does not support tools");
}

Ok(enforced_temperature)
}

⚠️ Potential issue | 🟠 Major

The model catalog still doesn't gate system/thinking requests.

validate_model_capabilities() ignores supports_system_messages, supports_thinking, and thinking_required. The Anthropic adapter already turns system role messages into a system field and accepts providerOptions.anthropic.thinking, so unsupported or required combinations can still reach upstream. Validate those flags here so the catalog actually constrains requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/svc/proxy/mod.rs` around lines 157 - 205, validate_model_capabilities
currently doesn't enforce system/thinking flags; update it to reject or enforce
requests based on route.model.supports_system_messages,
route.model.supports_thinking, and route.model.thinking_required: inspect the
CanonicalChatRequest for any system-role messages (e.g., scan request.messages
for role == "system") and if any are present but
route.model.supports_system_messages is false, anyhow::bail; detect use of
"thinking" (check request for a dedicated thinking flag or provider options such
as request.provider_options.anthropic.thinking) and if thinking is present but
route.model.supports_thinking is false, anyhow::bail; if
route.model.thinking_required is true then require that thinking is present on
the request and otherwise bail; keep these checks alongside the existing
temperature/stream/tool validations in validate_model_capabilities so the
catalog properly gates system/thinking combinations.
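A standalone sketch of the extra checks; ModelFlags is a hypothetical stand-in for the real model catalog struct, and the error type is simplified to String instead of anyhow::Error:

```rust
/// Hypothetical subset of the model catalog flags named in the review.
pub struct ModelFlags {
    pub supports_system_messages: bool,
    pub supports_thinking: bool,
    pub thinking_required: bool,
}

/// Checks to add alongside the existing temperature/stream/tool validations.
pub fn validate_system_and_thinking(
    has_system_message: bool,
    thinking_requested: bool,
    model: &ModelFlags,
) -> Result<(), String> {
    if has_system_message && !model.supports_system_messages {
        return Err("model does not support system messages".into());
    }
    if thinking_requested && !model.supports_thinking {
        return Err("model does not support thinking".into());
    }
    if model.thinking_required && !thinking_requested {
        return Err("model requires thinking to be enabled".into());
    }
    Ok(())
}
```

In validate_model_capabilities() itself, has_system_message would come from scanning request.messages for a system role and thinking_requested from the provider options, as the prompt above describes.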


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/rts/extractors.rs`:
- Around line 21-27: The current extraction uses a case-sensitive
strip_prefix("Bearer ") and can accept whitespace-only tokens; update the logic
around auth_header/token extraction (the block that reads
parts.headers.get("Authorization") and the `let Some(token) = auth_header`
binding) to parse the Authorization header case-insensitively (e.g., split on
the first space and compare the scheme with eq_ignore_ascii_case or lowercasing)
and then trim the token and reject it if empty or all-whitespace before
proceeding; ensure you return the same early error path when the scheme is not
"bearer" (any case) or the token is empty after trimming.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7a7ed6ee-4e2c-4fed-a323-74e1b5390598

📥 Commits

Reviewing files that changed from the base of the PR and between a09bade and cce117d.

📒 Files selected for processing (2)
  • rustfmt.toml
  • src/rts/extractors.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
src/rts/**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

src/rts/**/*.rs: Use AppError from crates/core for user-facing errors in Axum request handlers
Use Serde for serialization and deserialization of request/response data

Files:

  • src/rts/extractors.rs
src/**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.rs: Use the tracing macros (info!, warn!, error!, debug!) throughout the codebase for observability
Use Tokio async runtime for concurrent request handling in Axum

Files:

  • src/rts/extractors.rs
🔇 Additional comments (3)
rustfmt.toml (2)

7-13: LGTM: Standard stable rustfmt options.

The configuration uses stable rustfmt options (max_width, use_field_init_shorthand, use_try_shorthand, force_explicit_abi, newline_style) that are well-supported and appropriate for consistent code formatting.


4-4: No action needed. The Rust 2024 edition is stable (since Rust 1.85.0, February 20, 2025) and fully supported by your MSRV and CI toolchain. The configuration is correct.

> Likely an incorrect or invalid review comment.
src/rts/extractors.rs (1)

33-36: Good error-context preservation on auth failures.

Line 34 logs the underlying error before returning AppError::Unauthorized, which keeps diagnostics intact without leaking internals to clients.

Comment on lines +21 to +27
let auth_header = parts
.headers
.get("Authorization")
.and_then(|value| value.to_str().ok())
.and_then(|value| value.strip_prefix("Bearer "));

let Some(token) = auth_header else {

⚠️ Potential issue | 🟡 Minor

Accept case-insensitive bearer scheme and reject empty tokens.

Line 25 uses a case-sensitive "Bearer " prefix match, which rejects valid lowercase or mixed-case variants of the bearer scheme and can pass whitespace-only tokens downstream. Parse the scheme case-insensitively and require a non-empty token.

Suggested fix
-        let auth_header = parts
+        let auth_header = parts
             .headers
             .get("Authorization")
             .and_then(|value| value.to_str().ok())
-            .and_then(|value| value.strip_prefix("Bearer "));
+            .and_then(|value| {
+                let mut it = value.splitn(2, ' ');
+                match (it.next(), it.next()) {
+                    (Some(scheme), Some(token))
+                        if scheme.eq_ignore_ascii_case("Bearer") && !token.trim().is_empty() =>
+                    {
+                        Some(token.trim())
+                    }
+                    _ => None,
+                }
+            });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rts/extractors.rs` around lines 21 - 27, The current extraction uses a
case-sensitive strip_prefix("Bearer ") and can accept whitespace-only tokens;
update the logic around auth_header/token extraction (the block that reads
parts.headers.get("Authorization") and the `let Some(token) = auth_header`
binding) to parse the Authorization header case-insensitively (e.g., split on
the first space and compare the scheme with eq_ignore_ascii_case or lowercasing)
and then trim the token and reject it if empty or all-whitespace before
proceeding; ensure you return the same early error path when the scheme is not
"bearer" (any case) or the token is empty after trimming.

Ignore the documents directory in .gitignore and remove legacy
docs/ARCHITECTURE.md and docs/TODO.md from the repository
@CLoaKY233 CLoaKY233 merged commit d99558c into main Mar 27, 2026
4 of 5 checks passed