Contributor
Pull request overview
Refactors AutoRound toward a new “context + compressor + algorithm” architecture, introducing new compressors_new/ and context/ modules and updating scheme parsing/export helpers to support the new flow.
Changes:
- Added new context singletons (`ModelContext`, `CompressContext`) and a new `compressors_new` implementation path.
- Expanded scheme parsing to reconcile `bits`/`data_type` and support user overrides + AutoScheme integration.
- Added new calibration utilities and algorithm scaffolding for quantization backends (AutoRound/RTN).
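The `bits`/`data_type` reconciliation mentioned above can be sketched as follows. This is a hypothetical illustration of the idea, not the actual helper in `auto_round/schemes.py`; the function name `reconcile` and its signature are placeholders.

```python
import re

# Hypothetical sketch: derive missing bits from a data_type string such as
# "int4" and reject inconsistent user overrides. Names are illustrative only.
def reconcile(bits=None, data_type=None):
    if data_type is not None:
        match = re.search(r"(\d+)$", data_type)  # trailing digits, e.g. "int4" -> 4
        inferred = int(match.group(1)) if match else None
        if bits is None:
            bits = inferred
        elif inferred is not None and inferred != bits:
            raise ValueError(f"bits={bits} conflicts with data_type={data_type}")
    return bits, data_type
```

Under this sketch, a scheme that specifies only `data_type="int4"` would resolve to 4 bits, while an explicit conflicting override would raise.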
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 18 comments.
| File | Description |
|---|---|
| auto_round/utils/model.py | Avoids runtime import cycles via TYPE_CHECKING for QuantizationScheme. |
| auto_round/schemes.py | Adds scheme override + parsing helpers and bits/dtype reconciliation. |
| auto_round/formats.py | Switches divisibility checks to global supported-layer constants. |
| auto_round/context/model_context.py | Introduces model lifecycle/loading + AMP setup and forward-hook management. |
| auto_round/context/compress_context.py | Introduces device/device_map and memory-usage knobs as shared context. |
| auto_round/context/base.py | Adds simple singleton context base. |
| auto_round/context/__init__.py | Package init for new context module. |
| auto_round/compressors_new/utils.py | New utility module (layer config, gguf mapping, caching helpers, forward helpers). |
| auto_round/compressors_new/shard_writer.py | New shard-based saver with optional safetensors support. |
| auto_round/compressors_new/config.py | Introduces extra/legacy config dataclasses for the new compressor path. |
| auto_round/compressors_new/base.py | New “BaseCompressor” implementation wiring contexts, formats, caching, quant loop. |
| auto_round/compressors_new/__init__.py | Package init for compressors_new. |
| auto_round/compressors/utils.py | Extends legacy layer-config resolution to include safetensors-only tensors and skip missing modules. |
| auto_round/calibration/utils.py | Adds helpers for “early stop” caching and input reshaping for block tuning. |
| auto_round/calibration/__init__.py | Package init for calibration. |
| auto_round/algorithms/quantization/rtn/rtn.py | Adds placeholder RTN quantization module file. |
| auto_round/algorithms/quantization/rtn/config.py | Adds RTN algorithm config stub. |
| auto_round/algorithms/quantization/rtn/__init__.py | Package init for RTN quantization. |
| auto_round/algorithms/quantization/base.py | Adds base quantization class stub. |
| auto_round/algorithms/quantization/auto_round/quantize.py | Adds new AutoRound quantizer implementation (algorithm object). |
| auto_round/algorithms/quantization/auto_round/config.py | Adds new AutoRound algorithm config. |
| auto_round/algorithms/quantization/auto_round/__init__.py | Package init for AutoRound quantization algorithm. |
| auto_round/algorithms/quantization/__init__.py | Package init for quantization algorithms. |
| auto_round/algorithms/base.py | Adds base algorithm stub. |
| auto_round/algorithms/alg_config.py | Adds base algorithm config stub. |
| auto_round/algorithms/__init__.py | Package init for algorithms. |
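The "simple singleton context base" in `auto_round/context/base.py` (per the table above) could look roughly like the sketch below. The class and attribute names here are assumptions for illustration, not the merged code; the point is one shared instance per context subclass.

```python
# Illustrative singleton base: each subclass gets exactly one shared instance.
# Names (SingletonContext, ModelContext, CompressContext) are placeholders.
class SingletonContext:
    _instances = {}

    def __new__(cls):
        if cls not in cls._instances:
            # First construction of this subclass: cache the instance.
            cls._instances[cls] = super().__new__(cls)
        return cls._instances[cls]


class ModelContext(SingletonContext):
    pass


class CompressContext(SingletonContext):
    pass
```

With this pattern, any module that constructs `ModelContext()` sees the same object, which is how shared state like device maps or AMP settings could be exposed without passing a context object through every call.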
Contributor
If there is already an algorithm folder, what is the purpose of the compressor folder?
wenhuach21
reviewed
Mar 13, 2026
Signed-off-by: n1ck-guo <heng.guo@intel.com>
for more information, see https://pre-commit.ci
wenhuach21
reviewed
Apr 1, 2026
from auto_round.algorithms.rotation.hadamard.config import HadamardConfig, normalize_hadamard_config
from auto_round.algorithms.rotation.hadamard.transforms import build_hadamard_transform
from auto_round.algorithms.transforms.base import BaseRotation
from auto_round.algorithms.transforms.hadamard.config import HadamardConfig, normalize_hadamard_config
Contributor
I’d prefer using rotation, but it’s up to you.
Description
Main entry point responsible for orchestrating the workflow, invoking different algorithms, and handling model persistence. Supports block-wise or layer-wise quantization strategies. Primary subclasses include TuneCompressor and ZeroShotCompressor.
Usage of the new API:
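The orchestration described above (an entry-point compressor invoking an algorithm over a layer-wise or block-wise loop) can be sketched with a self-contained mock. Every name below (`RTNAlgorithm`, `BaseCompressor`, `quantize_layer`, `compress`) is a placeholder standing in for the real classes, not the actual `auto_round` API.

```python
# Hedged mock of the "compressor orchestrates algorithm" flow; the model is a
# plain dict of layer name -> weight stand-in, and quantization is simulated.
class RTNAlgorithm:
    def quantize_layer(self, name, weight):
        # Stand-in for real round-to-nearest quantization of one layer.
        return f"q({weight})"


class BaseCompressor:
    def __init__(self, model, algorithm):
        self.model = model          # layer name -> weight stand-in
        self.algorithm = algorithm  # pluggable algorithm object

    def compress(self):
        # Layer-wise loop: hand each layer to the algorithm, collect results.
        return {n: self.algorithm.quantize_layer(n, w)
                for n, w in self.model.items()}


compressed = BaseCompressor({"linear1": 0.5}, RTNAlgorithm()).compress()
```

A subclass such as the `TuneCompressor` mentioned above would presumably override the loop body with block-wise tuning, while the `ZeroShotCompressor` would keep a single pass; that split is why the algorithm is a separate, swappable object.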
Type of Change
Related Issues
Fixes or relates to #
Checklist Before Submitting