
Fix[bug] ONNX models generated by llm_export.py are missing some i/o #1157

Open
Ratheesh1104 wants to merge 1 commit into NVIDIA:main from Ratheesh1104:fix/onnx-missing-io-1147

Conversation

@Ratheesh1104

@Ratheesh1104 Ratheesh1104 commented Apr 1, 2026

What does this PR do?

Fixes the missing input and output nodes in ONNX models generated by llm_export.py (#1147).
Now the following nodes are properly included:

  • attention_mask
  • position_ids
  • past_key_values*

This ensures ONNX models are fully compatible with downstream TensorRT workflows and standard LLM inference pipelines.

Type of change: Bug fix

Usage

from modelopt.onnx.llm_export import llm_export

# Export HuggingFace model to ONNX with all necessary inputs/outputs
llm_export(
    hf_model_path="meta-llama/Llama-3.1-8B-Instruct",
    dtype="int4_awq",
    output_dir="models/Llama-3.1-8B-Instruct-ONNX-INT4"
)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

## Bug Fixes

* **LLM ONNX Export**: Enhanced the LLM export example to use concrete dummy tensor inputs with proper attention masks and position IDs instead of empty placeholders. Added dynamic-axes mappings for batch and sequence dimensions to improve ONNX model compatibility.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

@Ratheesh1104 Ratheesh1104 requested a review from a team as a code owner April 1, 2026 17:45
@Ratheesh1104 Ratheesh1104 requested a review from galagam April 1, 2026 17:45
@copy-pr-bot

copy-pr-bot bot commented Apr 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai bot commented Apr 1, 2026

📝 Walkthrough

Walkthrough

The main() function in examples/torch_onnx/llm_export.py now populates dummy tensor inputs with concrete values (batch_size=1, seq_len=8) for LLM export, including input_ids, attention_mask, and position_ids. Dynamic axis mappings are also defined for these inputs and the output logits.

Changes

LLM Export Input Setup (examples/torch_onnx/llm_export.py):
Updated main() to construct concrete dummy tensors (input_ids, attention_mask, position_ids) with fixed dimensions and populate extra_dyn_axes with batch/sequence dynamic axis mappings for LLM→ONNX export.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
  • Description Check ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: The title directly addresses the bug fix by referencing missing I/O in ONNX models from llm_export.py, which aligns with the PR's core objective of adding attention_mask, position_ids, and past_key_values to exported ONNX models.
  • Docstring Coverage ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.
  • Security Anti-Patterns ✅ Passed: The pull request introduces only local variable assignments and tensor creation operations without violating any critical security anti-patterns outlined in SECURITY.md.



Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/torch_onnx/llm_export.py`:
- Around line 374-384: extra_inputs currently supplies attention_mask and
position_ids which don't exist in WrapperModelForCausalLM.forward (only
input_ids and past_key_values), causing unexpected kwargs and duplicate
input_ids; either update WrapperModelForCausalLM.forward and the
llm_to_onnx/torch_to_onnx plumbing to accept attention_mask/position_ids, or
minimally fix extra_inputs to only include input_ids (and adjust extra_dyn_axes
to remove attention_mask/position_ids) so the call that expands **extra_inputs
matches the wrapper signature and does not pass input_ids twice; update
references where extra_inputs and extra_dyn_axes are used in the
torch_to_onnx/llm_to_onnx flow accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9361d85b-0639-4da1-a683-67ddefd86246

📥 Commits

Reviewing files that changed from the base of the PR and between 09b3c0b and 99c9912.

📒 Files selected for processing (1)
  • examples/torch_onnx/llm_export.py

Comment on lines +374 to +384
extra_inputs = {
"input_ids": dummy_input_ids,
"attention_mask": dummy_attention_mask,
"position_ids": dummy_position_ids,
}
extra_dyn_axes = {
"input_ids": {0: "batch", 1: "seq_len"},
"attention_mask": {0: "batch", 1: "seq_len"},
"position_ids": {0: "batch", 1: "seq_len"},
"logits": {0: "batch", 1: "seq_len"},
}
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read-only verification of signature/call-path mismatch
rg -n -C3 'def forward\(self, input_ids.*past_key_values' modelopt/onnx/llm_export_utils/export_utils.py
rg -n -C8 'torch_to_onnx\(' modelopt/onnx/llm_export_utils/export_utils.py
rg -n -C8 'extra_inputs = \{|extra_dyn_axes = \{' examples/torch_onnx/llm_export.py

Repository: NVIDIA/Model-Optimizer

Length of output: 2743


extra_inputs breaks the export call contract and causes runtime failure.

The wrapper's forward signature only accepts input_ids and past_key_values, but extra_inputs passes input_ids, attention_mask, and position_ids as kwargs. When expanded via **extra_inputs in the torch_to_onnx call at line 134, the model's forward receives unexpected keyword arguments (attention_mask, position_ids), causing a TypeError. Additionally, input_ids would be passed twice — once positionally and once as a kwarg, which is invalid.
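
A hypothetical minimal reproduction of that mismatch, with a plain function standing in for the wrapper's forward (names mirror the review; the real signature lives in the llm_export utilities):

```python
# Stand-in for WrapperModelForCausalLM.forward: accepts only input_ids
# and past_key_values, like the signature described in the review.
def forward(input_ids, past_key_values=None):
    return input_ids


extra_inputs = {
    "input_ids": [[1] * 8],
    "attention_mask": [[1] * 8],
    "position_ids": [list(range(8))],
}

try:
    # Mirrors forward(dummy_input_ids, **extra_inputs): input_ids is passed
    # both positionally and as a kwarg, plus two unexpected kwargs.
    forward([[1] * 8], **extra_inputs)
except TypeError as exc:
    error = str(exc)

assert "input_ids" in error  # got multiple values for argument 'input_ids'
```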

Suggested minimal fix
-        extra_inputs = {
-            "input_ids": dummy_input_ids,
-            "attention_mask": dummy_attention_mask,
-            "position_ids": dummy_position_ids,
-        }
-        extra_dyn_axes = {
-            "input_ids": {0: "batch", 1: "seq_len"},
-            "attention_mask": {0: "batch", 1: "seq_len"},
-            "position_ids": {0: "batch", 1: "seq_len"},
-            "logits": {0: "batch", 1: "seq_len"},
-        }
+        extra_inputs = {}
+        extra_dyn_axes = {"logits": {0: "batch_size", 1: "seq_len"}}

If attention_mask and position_ids are required as ONNX inputs, coordinate updates in WrapperModelForCausalLM.forward signature and llm_to_onnx plumbing first.
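
An illustrative-only sketch of that coordinated fix, with the class and model reduced to plain-Python stand-ins (the real WrapperModelForCausalLM and its plumbing live in modelopt's llm_export utilities):

```python
# Sketch: extend the wrapper's forward so **extra_inputs matches its signature.
class WrapperModelForCausalLM:
    def __init__(self, model):
        self.model = model

    def forward(self, input_ids, attention_mask=None, position_ids=None, past_key_values=None):
        # Pass the new inputs through to the underlying HF-style model.
        return self.model(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            past_key_values=past_key_values,
        )


# Toy model that just reports which kwargs it actually received.
def toy_model(input_ids, **kwargs):
    return sorted(k for k, v in kwargs.items() if v is not None)


wrapper = WrapperModelForCausalLM(toy_model)
received = wrapper.forward([[1, 2]], attention_mask=[[1, 1]], position_ids=[[0, 1]])
# received == ["attention_mask", "position_ids"]
```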


dummy_attention_mask = torch.ones((batch_size, seq_len), dtype=torch.long)
dummy_position_ids = torch.arange(seq_len).unsqueeze(0).repeat(batch_size, 1)

# Correct assignment — no trailing comma


There is a non-UTF-8 character at this line

