Added 'vllm_async' engine #31

Open
Sirorezka wants to merge 8 commits into WildEval:main from Sirorezka:feat_vllm_async
Conversation

@Sirorezka
In addition to launching several instances in shards, you can run the 'vllm_async' engine:

  • Less verbose than running multiple Python instances with shards;
  • Queue control is handed to the vLLM engine: vLLM decides how many instances go into each batch and how many stay queued.
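
To illustrate the second point, here is a minimal sketch of what handing queue control to an async engine could look like. It is not the code from this PR. The helper names (`generate_all`, `_final_output`) and the `FakeAsyncEngine` stand-in are hypothetical, and the `generate(prompt, sampling_params, request_id)` shape is an assumption modelled on vLLM's `AsyncLLMEngine.generate`, which streams outputs as an async generator.

```python
import asyncio
import uuid


async def _final_output(engine, prompt, sampling_params):
    """Submit one prompt and return the last output the engine streams back."""
    request_id = str(uuid.uuid4())  # each request needs a unique id
    final = None
    # Assumption: the engine yields partial outputs; only the last one is kept.
    async for output in engine.generate(prompt, sampling_params, request_id):
        final = output
    return final


async def generate_all(engine, prompts, sampling_params=None):
    """Submit every prompt at once and let the engine batch and queue them."""
    tasks = [_final_output(engine, p, sampling_params) for p in prompts]
    # gather() returns results in the same order as `prompts`.
    return await asyncio.gather(*tasks)


class FakeAsyncEngine:
    """Hypothetical stand-in so the sketch runs without vLLM or a GPU."""

    async def generate(self, prompt, sampling_params, request_id):
        await asyncio.sleep(0)  # yield control, as a real engine step would
        yield f"partial:{prompt}"
        await asyncio.sleep(0)
        yield f"done:{prompt}"


if __name__ == "__main__":
    outputs = asyncio.run(generate_all(FakeAsyncEngine(), ["a", "b", "c"]))
    print(outputs)
```

The caller never chunks the prompts into batches or shards. All requests are submitted up front, and scheduling is left to the engine, which is the behaviour the bullet above describes.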

2 participants