
fix hadamard transform weight dtype, using float32 as default and in-place transformed weight #1665

Open

lkk12014402 wants to merge 5 commits into intel:main from lkk12014402:fix_hadamard_transform_dtype

Conversation

@lkk12014402
Contributor

Description

Use torch.float64 as the default dtype for the Hadamard transform weight.


Copilot AI left a comment


Pull request overview

This PR changes the default Hadamard transform weight dtype to torch.float64 and adjusts the activation/weight transform application paths to align input/output dtypes with the Hadamard matrix, restoring original input dtypes after the transform.

Changes:

  • Default Hadamard transform precision to torch.float64 and document the behavior.
  • Remove explicit precision=module.dtype / precision=module.weight.dtype when building transforms, relying on the new default.
  • Add dtype-casting logic in both the Triton (mxfp4_forward_kernel_wrapper) and non-Triton hook paths, and cast outputs back to the original input dtype.
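The activation-side casting described in the last bullet can be sketched as follows. This is a minimal illustration, not the PR's actual code: `hadamard_matrix` and `apply_transform` are hypothetical names, and the normalized Sylvester construction stands in for the library's Hadamard weights.

```python
import torch

def hadamard_matrix(n: int, dtype: torch.dtype = torch.float32) -> torch.Tensor:
    # Sylvester construction of a normalized Hadamard matrix; n must be a power of two.
    H = torch.ones(1, 1, dtype=dtype)
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], dim=1),
                       torch.cat([H, -H], dim=1)], dim=0)
    return H / n ** 0.5  # normalized, so H @ H.T == I

def apply_transform(x: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
    # Cast the activation to the Hadamard matrix dtype, transform, then
    # restore the original dtype, mirroring the hook-path change above.
    orig_dtype = x.dtype
    return (x.to(H.dtype) @ H).to(orig_dtype)

x = torch.randn(4, 8, dtype=torch.bfloat16)
H = hadamard_matrix(8)            # float32 by default, per this PR
y = apply_transform(x, H)
assert y.dtype == torch.bfloat16  # caller-visible dtype is unchanged
```

The key point is that the compute dtype is taken from the Hadamard matrix, while the input and output dtypes stay whatever the surrounding model uses.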

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Changed files:

  • auto_round/experimental/transform/triton/mxfp4.py: casts x to the Hadamard matrix dtype before launching the Triton kernel.
  • auto_round/experimental/transform/hadamards.py: sets the default Hadamard weight precision to float64 and adds an expanded docstring.
  • auto_round/experimental/transform/apply.py: stops passing the module dtype into transform construction; restores the original activation dtype after applying transforms.
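The in-place weight transform mentioned in the PR title could look roughly like this. A hedged sketch only: `transform_weight_inplace` is a hypothetical helper, not code from apply.py.

```python
import torch

@torch.no_grad()
def transform_weight_inplace(weight: torch.Tensor, hadamard: torch.Tensor) -> None:
    # Compute the transform in the Hadamard matrix's precision (float32 by
    # default, per this PR), then write the result back into the existing
    # weight storage so no second full-size parameter tensor is kept alive.
    weight.copy_((weight.to(hadamard.dtype) @ hadamard).to(weight.dtype))

w = torch.randn(4, 2, dtype=torch.bfloat16)
H = torch.tensor([[1.0, 1.0], [1.0, -1.0]]) / 2 ** 0.5  # float32 Hadamard
ptr = w.data_ptr()
transform_weight_inplace(w, H)
assert w.data_ptr() == ptr        # same storage: transformed in place
assert w.dtype == torch.bfloat16  # parameter dtype is preserved
```

Writing back with `copy_` keeps the parameter object and its storage intact, which matters for offline (weight) transforms where a temporary float32 copy would otherwise double memory for large layers.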

@wenhuach21
Contributor

I was mistaken. It makes sense to use higher precision for offline transformations; however, for online transformations, using torch.float64 would be significantly more costly I guess.

@lkk12014402
Contributor Author

lkk12014402 commented Apr 7, 2026

I was mistaken. It makes sense to use higher precision for offline transformations; however, for online transformations, using torch.float64 would be significantly more costly I guess.

There is no obvious performance degradation: float64 is about 10% slower than bfloat16, and after switching the dtype to float32 it is only about 1-2% slower than bfloat16. (Measured on the piqa task with lm_eval, using the hf backend.)
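The precision side of that trade-off can be checked with a quick round-trip sketch (hypothetical, not the lm_eval benchmark above): since a normalized Hadamard matrix is orthogonal, applying H and then H.T should recover the input up to the rounding error of the compute dtype, which makes the gap between bfloat16, float32, and float64 directly visible.

```python
import torch

def hadamard(n: int, dtype: torch.dtype) -> torch.Tensor:
    # Sylvester construction of a normalized Hadamard matrix (n a power of two).
    H = torch.ones(1, 1, dtype=dtype)
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], 1), torch.cat([H, -H], 1)], 0)
    return H / n ** 0.5

def roundtrip_error(dtype: torch.dtype, n: int = 256) -> float:
    # H is orthogonal, so x @ H @ H.T should return x exactly; the residual
    # measures the rounding error introduced by the chosen compute dtype.
    torch.manual_seed(0)
    x = torch.randn(16, n, dtype=torch.float64)
    H = hadamard(n, dtype)
    y = ((x.to(dtype) @ H) @ H.T).to(torch.float64)
    return (y - x).abs().max().item()

assert roundtrip_error(torch.float64) < roundtrip_error(torch.float32)
assert roundtrip_error(torch.float32) < roundtrip_error(torch.bfloat16)
```

In such a round trip, float32 typically sits orders of magnitude below bfloat16 in error while staying close to bfloat16 in cost, which is consistent with the float32 default this PR settles on.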

Signed-off-by: lkk12014402 <kaokao.lv@intel.com>
@lkk12014402 lkk12014402 changed the title from "fix hadamard transform weight dtype, using float64 as default" to "fix hadamard transform weight dtype, using float32 as default and in-place transformed weight" on Apr 8, 2026
lkk12014402 and others added 2 commits April 8, 2026 08:56
Signed-off-by: lkk12014402 <kaokao.lv@intel.com>
