Feature Description
Currently, the export path doesn't support the llm_compressor (compressed-tensors) format when using the INT4 (W4A16) quantization scheme.
Motivation and Use Case
It would be good to be able to export to the llm_compressor format with the INT4 (W4A16) scheme. This already works when going through llm_compressor with AutoRoundModifier, but using the autoround library directly with the W4A16 scheme and the llm_compressor packing format does not work.
Alternatives Considered
No response
Definition of Done
No response
Additional Context
No response