fix: TRT-LLM multimodal preprocessor - remove default_multimodal_input_loader from the embedding paths by moraxu · Pull Request #6924 · ai-dynamo/dynamo

moraxu · 2026-03-05T04:57:24Z

Overview:

A follow-up to #6840 that removes `default_multimodal_input_loader` calls from the embedding paths and instead passes the text prompt from the Rust frontend (where the chat template has already been applied) to TRT-LLM.

Details:

Where should the reviewer start?

Tested

llava-v1.6-mistral-7b-hf:
- E/P/D: embeddings & image URL
- E/PD: image URL only, embeddings don't work
- Aggregated: embeddings & image URL & text
Qwen3-VL-2B-Instruct:
- E/P/D: image URL

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Improves on GitHub issue: fix: TRT-LLM multimodal preprocessor - revert to the old default_multimodal_input_loader for the embeddings case #6840

Summary by CodeRabbit

New Features
- Support for pre-computed embeddings in multimodal requests, including explicit handling when a formatted prompt is required.
- Support for propagating a formatted prompt through multimodal request flow.
Refactor
- Unified multimodal data handling across request types with improved embedding/image loading, stricter validation, and clearer warnings/logging.

copy-pr-bot · 2026-03-05T04:57:28Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2026-03-05T04:57:33Z

👋 Hi moraxu! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

coderabbitai · 2026-03-05T05:08:20Z

Walkthrough

Adds optional formatted_prompt propagation and explicit embedding handling across Python and Rust preprocessing: Python multimodal paths now load/attach embeddings (structured under multi_modal_embeddings) and require formatted_prompt for embeddings; Rust gather_multi_modal_data gains a formatted_prompt parameter passed through preprocess flows.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Multimodal Processor (Python) `components/src/dynamo/trtllm/multimodal_processor.py` | Removed early token_id extraction in PD/EPD paths; added handling for `extra_args.formatted_prompt`; EPD-NIXL embeddings are structured as `{"image": [...]}`; PD flow now separates image URLs vs embedding files (`.pt/.pth/.bin`), loads embedding tensors, attaches `formatted_prompt` when present, warns/returns None if required prompt missing; added stricter size/access checks and more verbose logging. |
| Preprocessor (Rust) `lib/llm/src/preprocessor.rs` | Added `formatted_prompt: Option<String>` parameter to `gather_multi_modal_data` and threaded it through `gather_tokens`, `preprocess_request`, and related call sites; `extra_args` may now include `formatted_prompt` when multimodal data is present; preserved prior behavior when `formatted_prompt` is None. |
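
To make the PD-flow description above concrete, here is a minimal sketch of the two behaviors the summary mentions: separating image URLs from embedding files by extension, and returning None with a warning when the required `formatted_prompt` is missing. This is not the code from the PR. The function names, the `"prompt"` key, and the exact shape of the returned dict are assumptions for illustration; only the `.pt/.pth/.bin` extensions, `extra_args.formatted_prompt`, `multi_modal_embeddings`, and the `{"image": [...]}` layout come from the summary.

```python
import logging
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import urlparse

# Extensions the summary lists as embedding files.
EMBEDDING_EXTENSIONS = (".pt", ".pth", ".bin")


def split_image_and_embedding_urls(urls: List[str]) -> Tuple[List[str], List[str]]:
    """Partition inputs into (image_urls, embedding_paths) by file extension.

    Hypothetical helper: the real processor may classify inputs differently.
    """
    image_urls: List[str] = []
    embedding_paths: List[str] = []
    for url in urls:
        # Look only at the path component so a query string cannot hide the suffix.
        path = urlparse(url).path.lower()
        if path.endswith(EMBEDDING_EXTENSIONS):
            embedding_paths.append(url)
        else:
            image_urls.append(url)
    return image_urls, embedding_paths


def build_embedding_inputs(
    extra_args: Dict[str, Any], embeddings: List[Any]
) -> Optional[Dict[str, Any]]:
    """Attach the frontend-formatted prompt to pre-computed embeddings.

    Returns None (after a warning) when no formatted prompt was propagated,
    mirroring the "warns/returns None" behavior described in the summary.
    The output dict layout is an assumption, not the TRT-LLM input schema.
    """
    formatted_prompt = extra_args.get("formatted_prompt")
    if not formatted_prompt:
        logging.warning("Embeddings were provided without a formatted_prompt; skipping.")
        return None
    return {
        "prompt": formatted_prompt,
        "multi_modal_embeddings": {"image": embeddings},
    }
```

Classifying on the parsed URL path rather than the raw string is a design choice of this sketch: it keeps an image URL such as `http://host/a.png?cache=x.pt` from being mistaken for an embedding file.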

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hop through tensors, soft and light,
I tuck prompts where vectors take flight,
From Python burrow to Rusty glen,
Embeddings hum — we sing again,
A tiny rabbit, coding delight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | PR description lacks required details. Overview references PR `#6840` but Details section is empty; Where should reviewer start is missing; specific files to review are not called out. | Complete the Details section with specific changes made, and add a 'Where should the reviewer start' section naming critical files (components/src/dynamo/trtllm/multimodal_processor.py and lib/llm/src/preprocessor.rs) for focused review. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'fix: TRT-LLM multimodal preprocessor - remove default_multimodal_input_loader from the embedding paths' is specific and directly reflects the main change: removing default_multimodal_input_loader calls from embedding paths and passing formatted prompts instead. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


lib/llm/src/preprocessor.rs

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/src/dynamo/trtllm/multimodal_processor.py`:
- Line 326: The current logging line logs the entire processed_inputs (including
raw prompt text and tensors) which can leak sensitive prompts and create huge
logs; change the logging in the block that emits logging.info(f"Processed
inputs: {processed_inputs}") to instead log a sanitized summary: redact or omit
the prompt text and only log safe metadata such as tensor shapes/dtypes and
masked prompt length (or a boolean indicating presence of prompt), or call a
sanitizer function (e.g., sanitize_processed_inputs) before logging; ensure you
update the logging level to debug if needed and retain enough info for debugging
without printing full prompt content or tensor data.

In `@lib/llm/src/preprocessor.rs`:
- Around line 1278-1281: Formatting is failing around the call to
self.gather_multi_modal_data in preprocessor.rs; run rustfmt (cargo fmt) to
format the file and commit the updated formatting so the call site and
surrounding block conform to rustfmt rules (e.g., adjust spacing/indentation
around the builder invocation and the await? operator). Ensure the formatted
changes that include the gather_multi_modal_data(&request, &mut builder,
None).await? line are committed.
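
The logging concern raised for `multimodal_processor.py` above could be addressed along these lines. This is a minimal sketch, not the change that was merged: `sanitize_processed_inputs` is the name the review comment suggests, while `summarize_value` and the summary string formats are hypothetical. Tensors are detected by duck typing on a `.shape` attribute so the sketch does not depend on torch.

```python
import logging
from typing import Any, Dict


def summarize_value(value: Any) -> Any:
    """Replace a value with log-safe metadata (hypothetical helper)."""
    if isinstance(value, str):
        # Never echo prompt text; report only its length.
        return f"<str len={len(value)}>"
    shape = getattr(value, "shape", None)
    if shape is not None:
        # Tensor-like object: report shape and dtype, not the data.
        dtype = getattr(value, "dtype", "unknown")
        return f"<tensor shape={tuple(shape)} dtype={dtype}>"
    if isinstance(value, dict):
        return {key: summarize_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [summarize_value(item) for item in value]
    if value is None or isinstance(value, (bool, int, float)):
        return value
    # Anything else is reduced to its type name.
    return f"<{type(value).__name__}>"


def sanitize_processed_inputs(processed_inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of processed_inputs that is safe to write to logs."""
    return {key: summarize_value(value) for key, value in processed_inputs.items()}


def log_processed_inputs(processed_inputs: Dict[str, Any]) -> None:
    # Debug level, as the review comment suggests, with lazy %-formatting.
    logging.debug("Processed inputs: %s", sanitize_processed_inputs(processed_inputs))
```

With this shape, a call such as `log_processed_inputs(processed_inputs)` would replace the flagged `logging.info(f"Processed inputs: {processed_inputs}")` line while still showing which keys are present and how large each tensor is.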

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 486b9c66-0b40-4d2e-bca7-fa7976520e2b

📥 Commits

Reviewing files that changed from the base of the PR and between 869e733 and 61ff322.

📒 Files selected for processing (2)

components/src/dynamo/trtllm/multimodal_processor.py
lib/llm/src/preprocessor.rs


Signed-off-by: Michal Guzek <mguzek@nvidia.com>

indrajit96

@rmccorm4
@KrishnanPrash
@grahamking
for the Rust-side changes.

@moraxu worker changes LGTM except for minor changes.
Ran my local tests: all pass.


indrajit96

LGTM

indrajit96 · 2026-03-05T23:08:50Z

/ok to test a8e8d25

…t_loader from the embedding paths (#6924) Signed-off-by: Michal Guzek <mguzek@nvidia.com> Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

…loader (#6924) (#6993) Signed-off-by: Michal Guzek <mguzek@nvidia.com> Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com> Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>

moraxu requested review from a team as code owners March 5, 2026 04:57

pull-request-size bot added the size/L label Mar 5, 2026

github-actions bot added the fix, external-contribution, backend::trtllm, frontend, and multimodal labels Mar 5, 2026

moraxu force-pushed the trtllm-multimodal-processor-fix-the-path-for-embeddings branch from 62d4bbb to 61ff322 Compare March 5, 2026 05:03

pull-request-size bot added size/M and removed size/L labels Mar 5, 2026

moraxu commented Mar 5, 2026

lib/llm/src/preprocessor.rs (resolved)

moraxu requested review from 2ez4bz and indrajit96 March 5, 2026 05:14

coderabbitai bot reviewed Mar 5, 2026

components/src/dynamo/trtllm/multimodal_processor.py (resolved)

components/src/dynamo/trtllm/multimodal_processor.py (outdated, resolved)

lib/llm/src/preprocessor.rs (outdated, resolved)

First draft

3ec32fa

Signed-off-by: Michal Guzek <mguzek@nvidia.com>

moraxu force-pushed the trtllm-multimodal-processor-fix-the-path-for-embeddings branch from 61ff322 to 3ec32fa Compare March 5, 2026 17:21

pull-request-size bot added size/L and removed size/M labels Mar 5, 2026

Rebase fixes

13c9d91

Signed-off-by: Michal Guzek <mguzek@nvidia.com>

indrajit96 reviewed Mar 5, 2026

components/src/dynamo/trtllm/multimodal_processor.py (outdated, resolved)

indrajit96 requested review from KrishnanPrash, grahamking and rmccorm4 March 5, 2026 19:51

grahamking approved these changes Mar 5, 2026


Address reviews and linter

a8e8d25

Signed-off-by: Michal Guzek <mguzek@nvidia.com>

moraxu requested a review from indrajit96 March 5, 2026 20:50

moraxu enabled auto-merge (squash) March 5, 2026 21:45

indrajit96 approved these changes Mar 5, 2026


copy-pr-bot bot had a problem deploying to GITLAB March 5, 2026 23:13 Failure

moraxu merged commit e6ddf0e into ai-dynamo:main Mar 6, 2026
142 of 147 checks passed

indrajit96 mentioned this pull request Mar 6, 2026

fix: TRT-LLM multimodal preprocessor remove default_multimodal_input_loader (#6924) #6993

Merged

moraxu merged 3 commits into ai-dynamo:main from moraxu:trtllm-multimodal-processor-fix-the-path-for-embeddings

3 participants
