qwen3-vl does not work. #2376
Describe the bug
I tried running auto-opt on Qwen3-VL with the commit from #2345 applied, but it failed with the error below.
To Reproduce
% pip install 'olive-ai[auto-opt] @ git+https://github.com/microsoft/Olive.git@6c1a86971c6b5a9df513410979dc67984c992397'
% python -m olive auto-opt --model_name_or_path Qwen/Qwen3-VL-Embedding-2B --trust_remote_code --output_path models/Qwen3-VL-Embedding-2B-int8-webgpu --device gpu --provider WebGpuExecutionProvider --use_ort_genai --precision int8 --log_level 1
Loading HuggingFace model from Qwen/Qwen3-VL-Embedding-2B
[2026-03-31 11:22:31,736] [INFO] [run.py:99:run_engine] Running workflow default_workflow
[2026-03-31 11:22:31,770] [INFO] [cache.py:142:__init__] Using cache directory: <DIR>/.olive-cache/default_workflow
[2026-03-31 11:22:31,771] [INFO] [accelerator_creator.py:200:create_accelerator] Running workflow on accelerator spec: gpu-webgpu
[2026-03-31 11:22:31,781] [INFO] [engine.py:208:run] Running Olive on accelerator: gpu-webgpu
[2026-03-31 11:22:31,781] [INFO] [engine.py:836:_create_system] Creating target system ...
[2026-03-31 11:22:31,781] [INFO] [engine.py:839:_create_system] Target system created in 0.000029 seconds
[2026-03-31 11:22:31,781] [INFO] [engine.py:842:_create_system] Creating host system ...
[2026-03-31 11:22:31,781] [INFO] [engine.py:845:_create_system] Host system created in 0.000018 seconds
[2026-03-31 11:22:32,048] [INFO] [engine.py:668:_run_pass] Running pass conversion:onnxconversion
`torch_dtype` is deprecated! Use `dtype` instead!
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "<DIR>/.venv/lib/python3.13/site-packages/olive/__main__.py", line 11, in <module>
main(called_as_console_script=False)
~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/cli/launcher.py", line 75, in main
service.run()
~~~~~~~~~~~^^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/telemetry/telemetry_extensions.py", line 137, in wrapper
return func(*args, **kwargs)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/cli/auto_opt.py", line 177, in run
return self._run_workflow()
~~~~~~~~~~~~~~~~~~^^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/cli/base.py", line 45, in _run_workflow
workflow_output = olive_run(run_config)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/workflows/run/run.py", line 178, in run
return run_engine(package_config, run_config)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/workflows/run/run.py", line 139, in run_engine
return engine.run(
~~~~~~~~~~^
run_config.input_model,
^^^^^^^^^^^^^^^^^^^^^^^
...<5 lines>...
run_config.engine.log_severity_level,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/telemetry/telemetry_extensions.py", line 137, in wrapper
return func(*args, **kwargs)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 210, in run
self.run_accelerator(
~~~~~~~~~~~~~~~~~~~~^
input_model_config,
^^^^^^^^^^^^^^^^^^^
...<2 lines>...
accelerator_spec,
^^^^^^^^^^^^^^^^^
)
^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 287, in run_accelerator
self._run_no_search(input_model_config, input_model_id, accelerator_spec, artifacts_dir)
~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 331, in _run_no_search
should_prune, signal, model_ids = self._run_passes(input_model_config, input_model_id, accelerator_spec)
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 624, in _run_passes
model_config, model_id = self._run_pass(
~~~~~~~~~~~~~~^
pass_name,
^^^^^^^^^^
...<2 lines>...
accelerator_spec,
^^^^^^^^^^^^^^^^^
)
^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 719, in _run_pass
output_model_config = host.run_pass(p, input_model_config, output_model_path)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/systems/local.py", line 45, in run_pass
output_model = the_pass.run(model, output_model_path)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/olive_pass.py", line 243, in run
output_model = self._run_for_config(model, self.config, output_model_path)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/onnx/conversion.py", line 652, in _run_for_config
output_model = self._run_for_config_internal(model, config, output_model_path)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/onnx/conversion.py", line 695, in _run_for_config_internal
return self._convert_model_on_device(model, config, output_model_path, device, torch_dtype)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/onnx/conversion.py", line 713, in _convert_model_on_device
pytorch_model = model.load_model(cache_model=False)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/model/handler/hf.py", line 75, in load_model
model = load_model_from_task(self.task, self.model_path, **self.get_load_kwargs())
File "<DIR>/.venv/lib/python3.13/site-packages/olive/common/hf/utils.py", line 62, in load_model_from_task
model = from_pretrained(model_class, model_name_or_path, "model", **kwargs)
File "<DIR>/.venv/lib/python3.13/site-packages/olive/common/hf/utils.py", line 94, in from_pretrained
return cls.from_pretrained(get_pretrained_name_or_path(model_name_or_path, mlflow_dir), **kwargs)
~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<DIR>/.venv/lib/python3.13/site-packages/transformers/models/auto/auto_factory.py", line 384, in from_pretrained
raise ValueError(
...<2 lines>...
)
ValueError: Unrecognized configuration class <class 'transformers.models.qwen3_vl.configuration_qwen3_vl.Qwen3VLConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AfmoeConfig, ApertusConfig, ArceeConfig, AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BitNetConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, BltConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, CwmConfig, Data2VecTextConfig, DbrxConfig, DeepseekV2Config, DeepseekV3Config, DiffLlamaConfig, DogeConfig, Dots1Config, ElectraConfig, Emu3Config, ErnieConfig, Ernie4_5Config, Ernie4_5_MoeConfig, Exaone4Config, ExaoneMoeConfig, FalconConfig, FalconH1Config, FalconMambaConfig, FlexOlmoConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, Gemma3nConfig, Gemma3nTextConfig, GitConfig, GlmConfig, Glm4Config, Glm4MoeConfig, Glm4MoeLiteConfig, GlmMoeDsaConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GptOssConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeHybridConfig, GraniteMoeSharedConfig, HeliumConfig, HunYuanDenseV1Config, HunYuanMoEV1Config, Jais2Config, JambaConfig, JetMoeConfig, Lfm2Config, Lfm2MoeConfig, LlamaConfig, Llama4Config, Llama4TextConfig, LongcatFlashConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegatronBertConfig, MiniMaxConfig, MiniMaxM2Config, MinistralConfig, Ministral3Config, MistralConfig, MixtralConfig, MllamaConfig, ModernBertDecoderConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NanoChatConfig, NemotronConfig, NemotronHConfig, OlmoConfig, Olmo2Config, Olmo3Config, OlmoHybridConfig, OlmoeConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3_5Config, Qwen3_5MoeConfig, Qwen3_5MoeTextConfig, Qwen3_5TextConfig, Qwen3MoeConfig, Qwen3NextConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, SeedOssConfig, SmolLM3Config, SolarOpenConfig, StableLmConfig, Starcoder2Config, TrOCRConfig, VaultGemmaConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, xLSTMConfig, XmodConfig, YoutuConfig, ZambaConfig, Zamba2Config.
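For what it's worth, the same ValueError reproduces outside Olive in a few lines (a minimal sketch against the same transformers install; using from_config avoids downloading the weights):

```python
# Minimal standalone repro of the error above (a sketch, not part of Olive).
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(
    "Qwen/Qwen3-VL-Embedding-2B", trust_remote_code=True
)
# Qwen3VLConfig has no entry in the AutoModelForCausalLM mapping, so this
# raises the same "Unrecognized configuration class" ValueError:
model = AutoModelForCausalLM.from_config(config)
```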
Other information
- OS: macOS (Apple M4)
- Olive version: main
- ONNXRuntime package and version: X
- Transformers package version: 5.4.0
Additional context
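My guess at the root cause: Olive's load_model_from_task (visible in the traceback) resolves the task to AutoModelForCausalLM, and Qwen3VLConfig is not registered in that mapping. Loading through the generic AutoModel entry point avoids the mapping entirely (a sketch, not verified end to end with this checkpoint):

```python
# Possible workaround sketch (untested with this checkpoint): bypass the
# AutoModelForCausalLM mapping via the generic AutoModel entry point.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "Qwen/Qwen3-VL-Embedding-2B",
    trust_remote_code=True,  # the repo may supply its own model class
)
```

If the CLI supports overriding the task (I believe auto-opt accepts a --task flag, though I have not confirmed it handles this model type), steering it away from the causal-LM path might be another option.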