support WOQ model input, such as kimi2.5 #1642

Open
xin3he wants to merge 12 commits into main from xinhe/3-31a

@xin3he
Contributor

@xin3he xin3he commented Mar 31, 2026

Description

Please briefly describe your main changes, the motivation.

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please specify):

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.

Signed-off-by: Xin He <xin3.he@intel.com>
Copilot AI review requested due to automatic review settings March 31, 2026 03:39
Contributor

Copilot AI left a comment

Pull request overview

Adds initial support for WOQ (weight-only quantization) model inputs by extending the weight-type detection/conversion framework and adding a CPU test that exercises loading and converting a WOQ model to high precision.

Changes:

  • Introduces ModuleWeightType.WOQ and registers a new WOQHandler for detection/conversion in auto_round/utils/weight_handler.py.
  • Adds a new CPU test (test_w4a16) that loads a WOQ model and asserts WOQ detection + conversion behavior.
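
As a rough illustration of the detection/conversion pattern summarized above, here is a minimal sketch. Only the names `ModuleWeightType.WOQ` and `WOQHandler` come from this PR; the registry, the dict-based layer representation, the `qweight`/`scales`/`zero_point` keys, and the helper functions are hypothetical and are not taken from `auto_round/utils/weight_handler.py`.

```python
from enum import Enum, auto
from typing import Callable, Dict, Optional


class ModuleWeightType(Enum):
    """Weight formats a module may carry. Only WOQ is named in the PR summary."""
    WOQ = auto()


class WeightHandler:
    """Hypothetical base interface: detect a format, convert to high precision."""

    def detect(self, layer: dict) -> bool:
        raise NotImplementedError

    def convert(self, layer: dict) -> dict:
        raise NotImplementedError


# Hypothetical registry mapping each weight type to one handler instance.
_HANDLERS: Dict[ModuleWeightType, WeightHandler] = {}


def register_handler(weight_type: ModuleWeightType) -> Callable[[type], type]:
    """Class decorator that instantiates the handler and stores it in the registry."""
    def decorator(cls: type) -> type:
        _HANDLERS[weight_type] = cls()
        return cls
    return decorator


@register_handler(ModuleWeightType.WOQ)
class WOQHandler(WeightHandler):
    def detect(self, layer: dict) -> bool:
        # Assumption: a WOQ layer exposes integer weights plus a scale.
        return {"qweight", "scales"} <= layer.keys()

    def convert(self, layer: dict) -> dict:
        # Affine dequantization: w = (q - zero_point) * scale.
        scale = layer["scales"]
        zero = layer.get("zero_point", 0)
        return {"weight": [(q - zero) * scale for q in layer["qweight"]]}


def detect_weight_type(layer: dict) -> Optional[ModuleWeightType]:
    """Return the first registered type whose handler recognizes the layer."""
    for weight_type, handler in _HANDLERS.items():
        if handler.detect(layer):
            return weight_type
    return None


def convert_to_high_precision(layer: dict) -> dict:
    """Convert a recognized layer; return unrecognized layers unchanged."""
    weight_type = detect_weight_type(layer)
    if weight_type is None:
        return layer
    return _HANDLERS[weight_type].convert(layer)
```

In this sketch, adding another input format would only require a new enum member and a decorated handler class; the detection loop itself would stay the same.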

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| auto_round/utils/weight_handler.py | Adds WOQ enum value and a new WOQHandler intended to detect/convert WOQ quantized layers. |
| test/test_cpu/advanced/test_low_precision_input_model.py | Adds a new test case for a WOQ (w4a16) model path and validates detection/conversion. |
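
For readers unfamiliar with w4a16, the sketch below shows what converting 4-bit weights to high precision can involve. It assumes eight unsigned 4-bit values packed into each 32-bit word, lowest nibble first, with one scale and zero point per group. Packing order and group layout differ between checkpoint formats, and the function names are hypothetical, so this may not match what the PR's handler or test actually does.

```python
from typing import List


def unpack_int4(packed: int, count: int = 8) -> List[int]:
    """Extract `count` unsigned 4-bit values from one word, lowest nibble first (assumed order)."""
    return [(packed >> (4 * i)) & 0xF for i in range(count)]


def dequantize_group(packed_words: List[int], scale: float, zero_point: int) -> List[float]:
    """Dequantize one group of packed words with w = (q - zero_point) * scale."""
    values: List[float] = []
    for word in packed_words:
        values.extend((q - zero_point) * scale for q in unpack_int4(word))
    return values
```

A test along the lines of the one described in the table could build a small packed tensor, run the conversion, and compare the result against values computed by hand.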

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@xin3he xin3he marked this pull request as draft April 2, 2026 03:00
xin3he and others added 9 commits April 2, 2026 05:12
Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
…h failure (#1621)

Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…hemes in utils and tests (#1643)

Signed-off-by: Xin He <xin3.he@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Xin He <xin3.he@intel.com>
Signed-off-by: Xin He <xin3.he@intel.com>
… during module conversion

Signed-off-by: Xin He <xin3.he@intel.com>
@xin3he xin3he marked this pull request as ready for review April 7, 2026 05:45
@xin3he
Contributor Author

xin3he commented Apr 7, 2026

/azp run Unit-Test-CUDA-AutoRound

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@xin3he xin3he requested review from wenhuach21 and yiliu30 April 7, 2026 05:52
@xin3he
Contributor Author

xin3he commented Apr 9, 2026

Will update to fix CI after this PR is merged.


3 participants