Apply async-with pattern to ValkeyClient with retry-based disconnection #10755

@jopemachine

Objective

Refactor ValkeyClient usage to an async with pattern that internally tracks retry counts and decides whether to tear down the connection based on consecutive failures.

Details

  • Wrap ValkeyClient operations in an async with context manager pattern
  • Track retry_count internally; when the threshold is exceeded, disconnect the client rather than continuing to retry on a broken connection
  • Migrate all ValkeyClient usage sites to the new async with pattern
  • This has a wide blast radius: all 14 domain-specific Valkey clients and their callers need updating
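A minimal sketch of what the async with pattern described above could look like. Everything here is an assumption for illustration: GuardedClient, FakeConnection, session(), and max_consecutive_failures are hypothetical names, and the built-in ConnectionError stands in for whatever exception the real Valkey client raises on a broken connection.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Optional


class FakeConnection:
    """Stand-in for a real Valkey connection (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class GuardedClient:
    """Counts consecutive failures and drops the connection at a threshold."""

    def __init__(self, max_consecutive_failures: int = 3) -> None:
        self._max_failures = max_consecutive_failures
        self._retry_count = 0
        self._conn: Optional[FakeConnection] = None
        self.connect_calls = 0  # exposed only so the sketch is easy to inspect

    async def _ensure_connected(self) -> FakeConnection:
        # Lazily (re)create the connection after a teardown.
        if self._conn is None:
            self._conn = FakeConnection()
            self.connect_calls += 1
        return self._conn

    async def _disconnect(self) -> None:
        if self._conn is not None:
            await self._conn.close()
            self._conn = None

    @asynccontextmanager
    async def session(self) -> AsyncIterator[FakeConnection]:
        conn = await self._ensure_connected()
        try:
            yield conn
        except ConnectionError:
            # Assumption: only connection-level errors count toward teardown.
            self._retry_count += 1
            if self._retry_count >= self._max_failures:
                await self._disconnect()
                self._retry_count = 0
            raise
        else:
            # Any successful block resets the consecutive-failure counter.
            self._retry_count = 0


async def main() -> None:
    client = GuardedClient(max_consecutive_failures=2)
    for _ in range(2):
        try:
            async with client.session():
                raise ConnectionError("simulated broken connection")
        except ConnectionError:
            pass
    # After two consecutive failures the connection has been torn down;
    # the next session() call reconnects.
    async with client.session():
        pass
    print("connect calls:", client.connect_calls)


if __name__ == "__main__":
    asyncio.run(main())
```

In this sketch the context manager re-raises the error after updating the counter, so callers keep control over whether and when to retry; the manager only decides when the connection is no longer worth reusing.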

Implementation Reference

AgentClientPool (src/ai/backend/manager/clients/agent/pool.py) uses a similar pattern with acquire(), _record_failure(), and _record_success() that can serve as a reference for the implementation approach.
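For readers unfamiliar with that shape, here is a rough sketch of an acquire() / _record_failure() / _record_success() split. It is not the actual AgentClientPool code: KeyedPool, _Entry, failure_threshold, and the eviction behaviour are assumptions made for illustration, and only the three method names come from the reference above.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import AsyncIterator, Dict


@dataclass
class _Entry:
    client: object  # hypothetical placeholder for a real client object
    failures: int = 0


class KeyedPool:
    """Hypothetical pool; not the actual AgentClientPool implementation."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self._threshold = failure_threshold
        self._entries: Dict[str, _Entry] = {}

    def _get_or_create(self, key: str) -> _Entry:
        if key not in self._entries:
            self._entries[key] = _Entry(client=object())
        return self._entries[key]

    def _record_success(self, key: str) -> None:
        entry = self._entries.get(key)
        if entry is not None:
            entry.failures = 0

    def _record_failure(self, key: str) -> None:
        entry = self._entries.get(key)
        if entry is None:
            return
        entry.failures += 1
        if entry.failures >= self._threshold:
            # Evict so the next acquire() builds a fresh client.
            del self._entries[key]

    @asynccontextmanager
    async def acquire(self, key: str) -> AsyncIterator[object]:
        entry = self._get_or_create(key)
        try:
            yield entry.client
        except Exception:
            # Assumption: every exception counts as a failure in this sketch.
            self._record_failure(key)
            raise
        else:
            self._record_success(key)


async def main() -> None:
    pool = KeyedPool(failure_threshold=2)
    for _ in range(2):
        try:
            async with pool.acquire("agent-1"):
                raise RuntimeError("simulated failure")
        except RuntimeError:
            pass
    # The entry was evicted at the threshold, so this builds a new client.
    async with pool.acquire("agent-1"):
        pass


if __name__ == "__main__":
    asyncio.run(main())
```

Keeping the bookkeeping in two small methods makes the threshold logic testable without opening a real connection, which may help with the retry-based disconnection tests requested below.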

Code References

  • MonitoringValkeyClient: src/ai/backend/common/clients/valkey_client/client.py (~line 413)
  • Domain-specific clients (14 total): src/ai/backend/common/clients/valkey_client/valkey_*/client.py
  • Manager Valkey dependency: src/ai/backend/manager/dependencies/infrastructure/redis.py
  • Agent Valkey dependency: src/ai/backend/agent/dependencies/infrastructure/redis.py
  • AgentClientPool (reference only): src/ai/backend/manager/clients/agent/pool.py

Acceptance Criteria

  • ValkeyClient operations are wrapped in an async with context manager
  • Retry count is tracked internally and the connection is torn down when the threshold is exceeded
  • All ValkeyClient usage sites are migrated to the new pattern
  • Verified in both Standalone and Sentinel modes
  • Tests covering the retry-based disconnection logic are included

JIRA Issue: BA-5578
