
Fix legacy ONNX export crash with transformers >= 5.0#2381

Open
Lidang-Jiang wants to merge 1 commit into microsoft:main from Lidang-Jiang:fix/auto-opt-transformers5

Conversation

@Lidang-Jiang

Summary

Fix the olive auto-opt crash that occurs when using the default (non-dynamo) ONNX export path with transformers >= 5.0.

Root cause: transformers 5.0 removed DynamicCache.from_legacy_cache() / .to_legacy_cache() and changed the cache API so that past_key_values must be a Cache object rather than a list of tuples. The legacy ONNX export path still passes list-format past_key_values via merge_kv_cache_to_tuple_hook, which raises an AttributeError during model tracing.
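The failure mode can be made concrete with a small compatibility probe. This is an illustrative sketch, not Olive or transformers code: the helper `to_cache` and the two dummy cache classes are hypothetical, and only stand in for code that assumes the removed `from_legacy_cache` helper exists.

```python
def to_cache(cache_cls, legacy_kv):
    """Convert list-of-tuples past_key_values into a Cache object.

    Works only while the cache class still provides the legacy helper,
    which transformers 5.0 removed.
    """
    from_legacy = getattr(cache_cls, "from_legacy_cache", None)
    if from_legacy is None:
        # transformers >= 5.0: the helper is gone, so list-format
        # past_key_values can no longer be converted.
        raise RuntimeError(
            "DynamicCache.from_legacy_cache was removed in transformers 5.0"
        )
    return from_legacy(legacy_kv)


class OldStyleCache:
    """Mimics the transformers < 5.0 API surface (hypothetical stand-in)."""

    @classmethod
    def from_legacy_cache(cls, kv):
        return ("cache", kv)


class NewStyleCache:
    """Mimics transformers >= 5.0: no legacy conversion helper."""
```

Against `OldStyleCache` the conversion succeeds; against `NewStyleCache` it fails exactly the way the export path does, just with a clearer message than the raw `AttributeError` shown below.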

Fix:

  1. Add an early check in _convert_model_on_device: when transformers >= 5.0 and use_dynamo_exporter=False, raise a clear RuntimeError directing users to --use_model_builder --use_ort_genai (recommended) or --use_dynamo_exporter.
  2. Guard _patch_model_if_necessary to skip on transformers >= 5.0 (the from_legacy_cache / to_legacy_cache calls are only valid for 4.45 <= transformers < 5.0).
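Both steps above reduce to a coarse version gate. A minimal, dependency-free sketch follows; the function names, parsing helper, and message wording are illustrative, not the exact Olive implementation.

```python
def _major_minor(version_str: str) -> tuple:
    """Parse 'X.Y[.Z...]' into (X, Y) for coarse version comparison."""
    parts = version_str.split(".")
    return int(parts[0]), int(parts[1]) if len(parts) > 1 else 0


def check_legacy_export_supported(transformers_version: str,
                                  use_dynamo_exporter: bool) -> None:
    """Mirrors fix step 1: reject the legacy (non-dynamo) export path
    on transformers >= 5.0 with an actionable error."""
    if not use_dynamo_exporter and _major_minor(transformers_version) >= (5, 0):
        raise RuntimeError(
            f"Legacy ONNX export (use_dynamo_exporter=False) is not compatible "
            f"with transformers {transformers_version}. Use --use_model_builder "
            f"--use_ort_genai (recommended), --use_dynamo_exporter, or "
            f"downgrade transformers below 5.0."
        )


def should_patch_model(transformers_version: str) -> bool:
    """Mirrors fix step 2: the legacy-cache patch only applies for
    4.45 <= transformers < 5.0."""
    return (4, 45) <= _major_minor(transformers_version) < (5, 0)
```

A tuple comparison on (major, minor) is enough here because both boundaries in the fix are whole minor versions; a full version parser would be needed for pre-releases.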

Fixes #2335

Before (cryptic crash)
$ python -m olive auto-opt \
  --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct \
  --output_path /tmp/test/qwen-cpu-int4 \
  --device cpu --provider CPUExecutionProvider --precision int4 --log_level 1

Loading HuggingFace model from Qwen/Qwen2.5-0.5B-Instruct
[INFO] Running workflow default_workflow
[INFO] Running pass conversion:onnxconversion

...full traceback...

  File ".../transformers/masking_utils.py", line 846, in _preprocess_mask_arguments
    q_offset = past_key_values.get_seq_length()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'get_seq_length'
After (clear error message)
$ python -m olive auto-opt \
  --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct \
  --output_path /tmp/test/qwen-cpu-int4 \
  --device cpu --provider CPUExecutionProvider --precision int4 --log_level 1

RuntimeError: Legacy ONNX export (use_dynamo_exporter=False) is not compatible with
transformers 5.5.0. transformers >= 5.0 changed the DynamicCache API, which breaks
the non-dynamo export path for models with KV cache. Please use one of the following options:
  1. Add --use_model_builder --use_ort_genai to use the model builder (recommended)
  2. Add --use_dynamo_exporter to use the dynamo-based ONNX export
  3. Downgrade transformers below 5.0
Example: olive auto-opt --model_name_or_path <model> --use_model_builder --use_ort_genai ...
After (with recommended --use_model_builder flag, succeeds)
$ python -m olive auto-opt \
  --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct \
  --output_path /tmp/test/qwen-cpu-int4 \
  --device cpu --provider CPUExecutionProvider \
  --use_model_builder --use_ort_genai --precision int4 --log_level 1

Loading HuggingFace model from Qwen/Qwen2.5-0.5B-Instruct
[INFO] Running workflow default_workflow
[INFO] Running pass model_builder:modelbuilder
Saving processing files in .olive-cache/.../models for GenAI
[INFO] Pass model_builder:modelbuilder finished in 32.39 seconds
[INFO] Pass extract_adapters:extractadapters finished in 0.04 seconds
[INFO] Saved output model to /tmp/test/qwen-cpu-int4
Model is saved at /tmp/test/qwen-cpu-int4

Test plan

  • Reproduce original crash with transformers 5.5.0 + torch 2.11.0
  • Verify clear error message is shown on legacy export path
  • Verify --use_model_builder --use_ort_genai workaround succeeds
  • Existing unit tests pass

transformers 5.0 removed DynamicCache.from_legacy_cache() and changed
the cache API so that past_key_values must be a Cache object. The legacy
(non-dynamo) ONNX export path passes list-format past_key_values which
causes AttributeError during model tracing.

Add an early check in _convert_model_on_device to raise a clear error
message directing users to --use_model_builder or --use_dynamo_exporter.
Also guard _patch_model_if_necessary to skip on transformers >= 5.0.

Fixes microsoft#2335

Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>


Development

Successfully merging this pull request may close these issues.

olive auto-opt for CPU INT4 fails without --use_model_builder
DynamicCache.from_legacy_cache AttributeError with transformers 5.x
