[Bug]: offload/immediate_saving issue #1659

@wenhuach21

Problem Description

With PR #1656 applied, the following command fails with the AssertionError shown in the traceback below:

/models/gemma-4-26B-A4B-it --layer_config '{"model.language_model.layers.\d+.self_attn..*":{"bits":"8"},"model.language_model.layers.\d+.mlp..*":{"bits":"8"},"model.language_model.layers.\d+.router..*":{"bits":"8"}}' --iters 0 --output_dir "/data2/wenhuach/mixed" --scheme W4A16

The same command with --disable_opt_rtn added works fine:

/models/gemma-4-26B-A4B-it --layer_config '{"model.language_model.layers.\d+.self_attn..*":{"bits":"8"},"model.language_model.layers.\d+.mlp..*":{"bits":"8"},"model.language_model.layers.\d+.router..*":{"bits":"8"}}' --iters 0 --output_dir "/data2/wenhuach/mixed" --scheme W4A16 --disable_opt_rtn
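For readers unsure which layers the --layer_config keys select, here is a minimal sketch of how the three regex patterns from the command could be resolved against layer names. It assumes full-string matching with Python's re module, which may not be the rule auto-round actually applies. The bits_for helper, the layer names, and the default of "4" (read off the W4A16 scheme name) are all hypothetical and for illustration only.

```python
import re

# Patterns copied from the --layer_config argument in the report.
PATTERNS = {
    r"model.language_model.layers.\d+.self_attn..*": {"bits": "8"},
    r"model.language_model.layers.\d+.mlp..*": {"bits": "8"},
    r"model.language_model.layers.\d+.router..*": {"bits": "8"},
}


def bits_for(layer_name, default="4"):
    """Return the bit width chosen for layer_name, or default if nothing matches."""
    for pattern, cfg in PATTERNS.items():
        # Assumption: full-string regex match; auto-round's own rule may differ.
        if re.fullmatch(pattern, layer_name):
            return cfg["bits"]
    return default


if __name__ == "__main__":
    # Hypothetical layer names, not taken from the actual model.
    for name in (
        "model.language_model.layers.0.self_attn.q_proj",
        "model.language_model.layers.12.mlp.down_proj",
        "model.language_model.layers.3.router.proj",
        "model.language_model.layers.3.experts.gate_up_proj",
    ):
        print(name, "->", bits_for(name))
```

Note that the unescaped "." in each pattern matches any character, and the trailing "..*" requires at least one character after the submodule name, so under this assumption a name ending exactly in "self_attn" would fall back to the default.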

Traceback (most recent call last):
  File "/home/wenhuach/auto-round/auto_round/main.py", line 838, in <module>
    run()
    ~~~^^
  File "/home/wenhuach/auto-round/auto_round/main.py", line 822, in run
    start()
    ~~~~~^^
  File "/home/wenhuach/auto-round/auto_round/main.py", line 541, in start
    tune(args)
    ~~~~^^^^^^
  File "/home/wenhuach/auto-round/auto_round/main.py", line 761, in tune
    model, folders = autoround.quantize_and_save(export_dir, format=args.format) # pylint: disable=E1101
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wenhuach/auto-round/auto_round/compressors/base.py", line 1026, in quantize_and_save
    model, _ = self.quantize()
               ~~~~~~~~~~~~~^^
  File "/home/wenhuach/auto-round/auto_round/compressors/base.py", line 1816, in quantize
    return self._quantize_rtn()
           ~~~~~~~~~~~~~~~~~~^^
  File "/home/wenhuach/miniforge3/envs/autoround/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
  File "/home/wenhuach/auto-round/auto_round/compressors/base.py", line 1511, in _quantize_rtn
    shard_writer(self, is_finalize=True)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wenhuach/miniforge3/envs/autoround/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
    return func(*args, **kwargs)
  File "/home/wenhuach/auto-round/auto_round/compressors/shard_writer.py", line 267, in shard_writer
    rounder._shard_writer.finalize()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/home/wenhuach/auto-round/auto_round/compressors/shard_writer.py", line 207, in finalize
    self._add_tensor(pname, tensor.detach().to("cpu"))
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/wenhuach/auto-round/auto_round/compressors/shard_writer.py", line 127, in _add_tensor
    self._flush_shard()
    ~~~~~~~~~~~~~~~~~^^
  File "/home/wenhuach/auto-round/auto_round/compressors/shard_writer.py", line 159, in _flush_shard
    self._offload_to_meta(saved_params)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
  File "/home/wenhuach/auto-round/auto_round/compressors/shard_writer.py", line 176, in _offload_to_meta
    module.to("meta")
    ~~~~~~~~~^^^^^^^^
  File "/home/wenhuach/miniforge3/envs/autoround/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1381, in to
    return self._apply(convert)
           ~~~~~~~~~~~^^^^^^^^^
  File "/home/wenhuach/miniforge3/envs/autoround/lib/python3.13/site-packages/torch/nn/modules/module.py", line 933, in _apply
    module._apply(fn)
    ~~~~~~~~~~~~~^^^^
  File "/home/wenhuach/miniforge3/envs/autoround/lib/python3.13/site-packages/torch/nn/modules/module.py", line 999, in _apply
    assert isinstance(param, Parameter)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^
AssertionError

Reproduction Steps

~

Environment Information

No response

Error Logs

Additional Context

No response


Labels: bug (Something isn't working)
