This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from e1e9203 to 5e4ee86.
Signed-off-by: Samuel Monson <smonson@redhat.com>
`torch` seems to break when it encounters lazy ImportErrors Signed-off-by: Samuel Monson <smonson@redhat.com>
Force-pushed from 876cc30 to 8156271.
sjmonson added a commit that referenced this pull request on Mar 19, 2026:
## Summary

Clean up some `__init__` package code and move uvloop config to `__init__`.

## Details

This cleanup was originally part of #641, but as that PR is blocked I decided to split it out. Removing the transformers logging config does not seem to have any real effect; I do not get logs either way. Importing any huggingface libraries incurs a significant time cost, so this is a prerequisite to improving CLI responsiveness. Additionally, uvloop should be configured as early as possible, so the setup was moved to `__init__`.

---

- [x] "I certify that all code in this PR is my own, except as noted below."

## Use of AI

- [ ] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes `## WRITTEN BY AI ##`)
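The "configure uvloop as early as possible" idea above can be sketched as it might look in a package `__init__.py`. This is a minimal illustration under stated assumptions, not guidellm's actual code: the graceful fallback when uvloop is missing and the `ACTIVE_POLICY` name are both hypothetical.

```python
# Hypothetical sketch: install the uvloop event loop policy at package import
# time, falling back to stock asyncio when uvloop is not installed.
import asyncio

try:
    import uvloop  # optional dependency
except ImportError:
    uvloop = None


def _configure_event_loop() -> str:
    """Set the uvloop policy if available; return the name of the active policy."""
    if uvloop is not None:
        asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
        return "uvloop"
    return "asyncio"


# Runs once, on first import of the package, before any event loop is created.
ACTIVE_POLICY = _configure_event_loop()
```

Doing this in `__init__` (rather than in the CLI entrypoint) matters because the policy must be set before any code creates an event loop; a policy change after loop creation has no effect on already-running loops.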
sjmonson (Collaborator, Author) commented:

> This PR is currently blocked on "Lazy load in …"
sjmonson added a commit that referenced this pull request on Mar 20, 2026:
## Summary

Fixes spawn and forkserver multi-process contexts.

## Details

I was hoping that after #647 we could switch to `forkserver` by default. However, it turns out that `forkserver` and `spawn` will import the calling process's entrypoint (e.g. `__main__.py`), so we run into the same blocker as #641. I was able to confirm that stripping every heavy import out of `__main__.py` solves the issue, so we should be good to switch in v0.7.0. On my machine there is about a ~10s overhead for `forkserver` and slightly more for `spawn`, which is not the worst for a default. However, the overhead may be higher on other systems:

### `time guidellm benchmark run --profile poisson --rate 5 --data prompt_tokens=128,output_tokens=128 --max-seconds 30 --outputs json`

| Context    | real      | user      | sys      |
| ---------- | --------- | --------- | -------- |
| Fork       | 0m37.874s | 0m17.356s | 0m1.883s |
| Forkserver | 0m47.344s | 0m14.862s | 0m0.860s |
| Spawn      | 0m49.515s | 1m51.230s | 0m8.915s |

### `time guidellm benchmark run --profile concurrent --rate 400 --data prompt_tokens=128,output_tokens=128 --max-seconds 30 --outputs json`

| Context    | real      | user      | sys       |
| ---------- | --------- | --------- | --------- |
| Fork       | 0m39.324s | 0m37.602s | 0m5.623s  |
| Forkserver | 0m49.609s | 0m19.710s | 0m1.311s  |
| Spawn      | 0m50.399s | 2m9.724s  | 0m11.374s |

### `time guidellm benchmark run --profile concurrent --rate 400 --data prompt_tokens=128,output_tokens=128 --max-seconds 120 --outputs json`

| Context    | real      | user      | sys       |
| ---------- | --------- | --------- | --------- |
| Fork       | 2m15.309s | 1m42.911s | 0m15.957s |
| Forkserver | 2m25.964s | 0m38.891s | 0m2.802s  |
| Spawn      | 2m27.454s | 3m24.325s | 0m22.531s |

## Test Plan

Set `GUIDELLM__MP_CONTEXT_TYPE=forkserver` and confirm benchmarks run.

---

- [x] "I certify that all code in this PR is my own, except as noted below."
## Use of AI

- [x] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes `## WRITTEN BY AI ##`)
TODO: strip heavy imports from `__main__.py` to improve CLI responsiveness (see scientific-python/lazy-loader#168).

## Summary

Lazy loads extras submodules in order to defer import errors to the time of use.
## Details

TODO
## Test Plan

Without vLLM:

- `guidellm benchmark run --help` and observe no errors
- `uv run guidellm benchmark run --backend vllm_python --model test` and observe error with helpful message

With vLLM:

- `guidellm benchmark run --help` and observe no errors
- `uv run guidellm benchmark run --backend vllm_python --model test ...` and observe successful benchmark
- `tox -re test-e2e` and observe that tests pass (previously they would fail when vLLM was installed due to load times)

## Related Issues
## Use of AI

- [ ] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes `## WRITTEN BY AI ##`)