[NVBug: 6038899] Fix MoE export crash on meta tensors with CPU offload#1155
…nsors Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
Codecov Report: all modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##             main    #1155    +/-  ##
=======================================
+ Coverage   70.20%   70.21%   +0.01%
=======================================
  Files         230      230
  Lines       26098    26098
=======================================
+ Hits        18322    18325       +3
+ Misses       7776     7773       -3
```
Summary
Fixes
`NotImplementedError` in `sync_moe_gate_up_amax` when quantizing MoE models (e.g. Qwen3-30B-A3B) on a single GPU with insufficient VRAM.

When GPU memory is insufficient, ModelOpt enables CPU offload via accelerate, leaving uncalibrated expert parameters on the `meta` device. During export, `sync_moe_gate_up_amax` calls `torch.equal()` on these meta tensors, which raises `NotImplementedError` because `aten::equal` does not support meta tensors, even though calibration itself completed successfully.

Changes

- Update `sync_moe_gate_up_amax` to skip amax sync for meta tensors (which have no real data to sync) and emit a warning explaining the root cause.

Bug: https://nvbugspro.nvidia.com/bug/6038899
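The failure mode and the skip-and-warn guard can be sketched as follows. This is a minimal illustration, not ModelOpt's actual code: `sync_amax_skipping_meta` is a hypothetical stand-in for the patched helper, and the real `sync_moe_gate_up_amax` operates on quantizer state rather than a plain tensor list.

```python
import warnings

import torch


# Reproduce the crash: aten::equal has no meta-device kernel, so comparing
# two meta tensors (offloaded parameters with no real storage) raises
# NotImplementedError.
a = torch.empty(4, device="meta")
b = torch.empty(4, device="meta")
try:
    torch.equal(a, b)
except NotImplementedError:
    print("torch.equal is unsupported on meta tensors")


def sync_amax_skipping_meta(amax_tensors):
    """Hypothetical guard mirroring the fix: skip meta tensors with a warning,
    then sync (elementwise max) across the remaining real tensors."""
    real = []
    for t in amax_tensors:
        if t.is_meta:
            # Meta tensors hold no data to sync; warn about the root cause
            # (CPU offload left this expert's parameters uncalibrated).
            warnings.warn(
                "Skipping amax sync for a meta tensor: the expert was "
                "CPU-offloaded and holds no real data."
            )
            continue
        real.append(t)
    if not real:
        return None
    return torch.stack(real).max(dim=0).values
```

With this guard, export proceeds for the calibrated experts instead of aborting on the first offloaded one.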
🤖 Generated with Claude Code