
Add vLLM cache to TTS model #4

Merged
PatrickCmd merged 2 commits into deploy from vllm-cache
Feb 3, 2026
Conversation

@huwenjie333
Collaborator

This PR adds the vLLM cache to persistent storage in the deployment at /root/.cache/vllm, reducing cold start time from 1.5 minutes to 1 minute.
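As context (not part of the PR's diff, which is not shown here): vLLM keeps compiled artifacts under ~/.cache/vllm by default, and exposes the VLLM_CACHE_ROOT environment variable to override that location. A minimal shell sketch of how the cache path resolves, assuming vLLM's default behavior:

```shell
# Sketch only: resolve vLLM's cache directory the way vLLM does by default.
# VLLM_CACHE_ROOT is vLLM's override env var; when it is unset, the cache
# lives at ~/.cache/vllm -- the path this PR mounts onto persistent storage.
CACHE_DIR="${VLLM_CACHE_ROOT:-$HOME/.cache/vllm}"
echo "$CACHE_DIR"
```

Mounting a persistent volume at this path lets compiled artifacts survive container restarts instead of being rebuilt on every cold start, which is where the reported 1.5 min → 1 min improvement comes from.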

[Screenshot attached: 2026-01-09 at 2:16:52 PM]

@huwenjie333 huwenjie333 requested a review from jqug January 9, 2026 11:26
@huwenjie333 huwenjie333 assigned PatrickCmd and unassigned PatrickCmd Jan 9, 2026
@huwenjie333 huwenjie333 requested a review from PatrickCmd January 9, 2026 11:26
@PatrickCmd PatrickCmd merged commit 1d10d47 into deploy Feb 3, 2026
1 check failed

2 participants