
Add vLLM cache to TTS model #4

Merged
PatrickCmd merged 2 commits into deploy from vllm-cache
Feb 3, 2026
Conversation

@huwenjie333
Collaborator

This PR adds the vLLM cache to persistent storage in the deployment at /root/.cache/vllm, reducing cold start time from 1.5 minutes to 1 minute.
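As context (not part of the PR's diff, which is not shown here): vLLM keeps compiled artifacts under ~/.cache/vllm by default, and exposes the VLLM_CACHE_ROOT environment variable to override that location. A minimal shell sketch of how the cache path resolves, assuming vLLM's default behavior:

```shell
# Sketch only: resolve vLLM's cache directory the way vLLM does by default.
# VLLM_CACHE_ROOT is vLLM's override env var; when it is unset, the cache
# lives at ~/.cache/vllm -- the path this PR mounts onto persistent storage.
CACHE_DIR="${VLLM_CACHE_ROOT:-$HOME/.cache/vllm}"
echo "$CACHE_DIR"
```

Mounting a persistent volume at this path lets compiled artifacts survive container restarts instead of being rebuilt on every cold start, which is where the reported 1.5 min → 1 min improvement comes from.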

[Screenshot attached: 2026-01-09 at 2:16:52 PM]

@huwenjie333 huwenjie333 requested a review from jqug January 9, 2026 11:26
@huwenjie333 huwenjie333 assigned PatrickCmd and unassigned PatrickCmd Jan 9, 2026
@huwenjie333 huwenjie333 requested a review from PatrickCmd January 9, 2026 11:26
@PatrickCmd PatrickCmd merged commit 1d10d47 into deploy Feb 3, 2026
1 check failed

2 participants