Description
System information
- CPU: AMD Ryzen 7 260 w/ Radeon 780M Graphics
- iGPU: AMD Radeon 780M Graphics
- Platform reported by AMD NPU installer: PHX platform with MCDM installed
- NPU device in Windows: "NPU Compute Accelerator Device"
- NPU device status: OK
- NPU device hardware ID from runtime detection: PCI\VEN_1022&DEV_1502&REV_00
- OS: Windows 11 Pro 64-bit
- OS build: 10.0.26200.8037
- BIOS version: HPTP3_F8_10_EC0003_BI0101_Thomson_54W
- Ryzen AI Software version: 1.7.0
- NPU driver originally installed on system before cleanup: 32.0.203.240
- NPU driver installed after clean reinstall: 32.0.203.314
- Conda distribution: Miniforge
- Recreated Conda environment after reinstall: ryzen-ai-1.7.0
Summary
I performed a full clean reinstall of the AMD Ryzen AI 1.7.0 software stack on a Phoenix/Hawk Point system and updated the NPU driver to 32.0.203.314. After cleanup and reinstall, the official AMD quicktest still fails inside the Vitis AI execution path. In addition, the official OGA example and the official Stable Diffusion demo also fail on the same system with low-level runtime errors.
At this point, the issue appears to be in the Ryzen AI runtime / Vitis AI / VAIP stack rather than in model files, Conda setup, or basic NPU detection.
What I removed during cleanup
- Uninstalled RyzenAI 1.7.0 from Windows installed apps
- Removed Conda environments:
- ryzen-ai-1.7.0
- ryzenai-sd
- ryzenai-stable-diffusion
- Deleted remaining folders:
- C:\Program Files\RyzenAI
- %USERPROFILE%\RyzenAI-Models
- %USERPROFILE%\Llama-2-7b-chat-hf-onnx-ryzenai-hybrid
- %USERPROFILE%\AMD_SD_TURBO_TMP
- %USERPROFILE%\sd_out_turbo
Driver reinstall procedure
- Downloaded NPU driver package 32.0.203.314
- Ran .\npu_sw_installer.exe as Administrator
- Installer detected:
- existing NPU MCDM driver: 32.0.203.240
- target NPU MCDM driver: 32.0.203.314
- Installer uninstalled the existing driver successfully
- Installer installed kipudrv.inf successfully
- Rebooted Windows
- Verified after reboot:
- PowerShell shows "NPU Compute Accelerator Device" with status OK
- pnputil shows kipudrv.inf present
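For repeatability, the post-reboot device check can also be scripted; a sketch of what I ran, wrapped in Python (Get-PnpDevice is a standard PowerShell cmdlet; `query_npu_device` is my own helper name, and it returns None where PowerShell is unavailable):

```python
import shutil
import subprocess
from typing import Optional

def query_npu_device() -> Optional[str]:
    """Query Windows PnP for the NPU device and return the raw listing,
    or None when PowerShell is not available (e.g. non-Windows)."""
    if shutil.which("powershell") is None:
        return None
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command",
         "Get-PnpDevice -FriendlyName '*NPU Compute Accelerator*' | "
         "Format-List FriendlyName, Status, InstanceId"],
        capture_output=True,
    )
    # Decode with replacement so localized console output cannot raise.
    return result.stdout.decode(errors="replace")
```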
Relevant post-reinstall checks
- Conda environment ryzen-ai-1.7.0 was recreated successfully by the Ryzen AI 1.7.0 installer
- Device status remained OK after reboot
- quicktest identifies the platform as PHX/HPT and proceeds into the Vitis AI path before failing
Issue 1: official quicktest.py initially fails on non-English Windows with a Unicode decoding error
Before reaching the actual runtime path, the stock quicktest.py failed immediately with:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 6: invalid continuation byte
This happened in quicktest.py inside get_npu_info() on:
stdout.decode()
This seems to assume UTF-8 output from a Windows command, which is not always true on localized Windows installations.
To continue testing, I patched quicktest.py locally to decode the command output using a Windows-compatible codepage with error tolerance. After that local patch, the script proceeded correctly and detected the NPU type as PHX/HPT.
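The local workaround was along these lines (a sketch of the change applied inside get_npu_info(); `decode_console_output` is my own helper name): decode with the system's preferred codepage instead of assuming UTF-8, and replace undecodable bytes rather than raising.

```python
import locale

def decode_console_output(stdout: bytes) -> str:
    """Decode command output on localized Windows without raising
    UnicodeDecodeError, falling back to UTF-8 if no codepage is reported."""
    encoding = locale.getpreferredencoding(False) or "utf-8"
    return stdout.decode(encoding, errors="replace")
```

With this in place of the bare stdout.decode(), the script proceeded and detected the NPU type as PHX/HPT.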
Issue 2: official quicktest.py then fails inside the Vitis AI runtime
Command used:
conda activate ryzen-ai-1.7.0
set RYZEN_AI_INSTALLATION_PATH=C:\Program Files\RyzenAI\1.7.0
python "%RYZEN_AI_INSTALLATION_PATH%\quicktest\quicktest.py"
Observed output:
Setting environment for PHX/HPT
2026-03-18 13:14:22.7130477 [W:onnxruntime:, RedundantOpReductionPass.cpp:846 RedundantOpReductionPass.cpp] xir::Op{name = (/avgpool/GlobalAveragePool_output_0_Mul_vaip_163), type = qlinear-pool}'s input and output is unchanged, so it will be removed.
2026-03-18 13:14:23.0046779 [W:onnxruntime:, PartitionPass.cpp:12507 PartitionPass.cpp] xir::Op{name = (output)_replaced_input272, type = dequantize-linear} is partitioned to CPU as : doesn't supported by target [AMD_AIE2P_4x8_CMC_Overlay].
INFO: [aiecompiler 77-749] Reading logical device aie2p_8x4_device
Using TXN FORMAT 0.1
2026-03-18 13:14:24.9385520 [W:onnxruntime:, pass_main.cpp:250 pass_main.cpp] skip mmap in sg: 2. runner libgraph-engine.so not support mmap
F20260318 13:14:25.887775 9748 runner_requests_queue.cpp:165] -- Error: Failed to create runner: invalid vector subscript
2026-03-18 13:14:25.8885704 [F:onnxruntime:, runner_requests_queue.cpp:165 runner_requests_queue.cpp] -- Error: Failed to create runner: invalid vector subscript
2026-03-18 13:14:25.9543672 [F:onnxruntime:, runner_requests_queue.cpp:165 runner_requests_queue.cpp] -- Error: Failed to create runner: invalid vector subscript
2026-03-18 13:14:25.9575714 [E:onnxruntime:, sequential_executor.cc:572 onnxruntime::ExecuteKernel] Non-zero status code returned while running vitis_ai_ep_1 node. Name:'VitisAIExecutionProvider_vitis_ai_ep_1_0' Status Message: ... Failed to create runner: invalid vector subscript
Final error returned by quicktest:
Failed to run the InferenceSession: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running vitis_ai_ep_1 node ... Failed to create runner: invalid vector subscript
This is after:
- clean uninstall
- clean driver reinstall
- reboot
- clean Ryzen AI reinstall
- recreated Conda environment
Issue 3: official OGA example also fails on the same system
I also tested the official OGA path after installing the required packages and using an AMD-prepared model.
Verified packages:
- onnxruntime-genai-directml-ryzenai 0.11.2
- model-generate 1.7.0
- model_generate.exe present in the ryzen-ai-1.7.0 environment
Model used:
amd/Llama-2-7b-chat-hf-onnx-ryzenai-hybrid
Commands used:
set RYZEN_AI_INSTALLATION_PATH=C:\Program Files\RyzenAI\1.7.0
git lfs install
git clone https://huggingface.co/amd/Llama-2-7b-chat-hf-onnx-ryzenai-hybrid
python "%RYZEN_AI_INSTALLATION_PATH%\LLM\example\run_model.py" -m "%USERPROFILE%\Llama-2-7b-chat-hf-onnx-ryzenai-hybrid" -l 256
Observed error:
Warning: Invalid or missing prompt input. Using default prompts.
DEPRECATED session option was used (config_entries): use 'session_options' directly instead.
Exception caught in MatMulNBits_8_0_0 constructor: Bad Argument
[E:onnxruntime:onnxruntime-genai, inference_session.cc:2545] Exception during initialization: invalid unordered_map<K, T> key
RuntimeError: Exception during initialization: invalid unordered_map<K, T> key
Issue 4: official Stable Diffusion demo also fails on the same system
I also tested the official GenAI-SD demo using AMD-prepared BFP models.
Example command:
python run_sd.py --model_id "stabilityai/sd-turbo" --model_path "%USERPROFILE%\RyzenAI-Models\sd_turbo_bfp" --custom_op_path "C:\Program Files\RyzenAI\1.7.0\deployment\onnx_custom_ops.dll" --output_path "%USERPROFILE%\sd_out_turbo"
Observed error:
---Loading ONNX Unet for DD
custom op path: C:\Program Files\RyzenAI\1.7.0\deployment\onnx_custom_ops.dll
[INFO] Load unet/dd ...
Failed to initialize fusion runtime for node 'NhwcConv_0-conv_inConv': invalid unordered_map<K, T> key
[E:onnxruntime:, inference_session.cc:2545] Exception during initialization: invalid unordered_map<K, T> key
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: invalid unordered_map<K, T> key
The same type of initialization failure also occurred with a second AMD-prepared SD model path, so this does not appear to be an issue with a single bad model.
Why I believe this is a runtime issue
I tested:
- official AMD installer
- official AMD quicktest
- official AMD OGA example
- official AMD SD demo
- AMD-prepared model(s)
- fresh Conda environment created by installer
- fresh NPU driver update
- reboot after driver install
- NPU device detection successful in Windows
The failures occur in low-level runtime execution paths:
- quicktest: VitisAIExecutionProvider / runner creation / invalid vector subscript
- OGA: onnxruntime-genai initialization / invalid unordered_map<K, T> key
- Stable Diffusion: fusion runtime initialization / invalid unordered_map<K, T> key
These symptoms strongly suggest a bug or incompatibility in the Ryzen AI 1.7.0 runtime stack on this system rather than a user setup error.
Questions
- Is this a known issue on Phoenix/Hawk Point systems with Ryzen AI Software 1.7.0?
- Is there a recommended driver version for Ryzen AI 1.7.0 on PHX/HPT beyond 32.0.203.314?
- Is there a known-good runtime / driver combination for:
- quicktest
- OGA hybrid models
- GenAI-SD
- Is the quicktest.py UnicodeDecodeError on localized Windows already known and fixed internally?
- Is there any debug logging mode or additional validation tool I can run to provide more useful diagnostics?
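If more detailed runtime logs would help, I can rerun the failing sessions with verbose ONNX Runtime logging along these lines (a sketch using the standard onnxruntime SessionOptions API; `make_verbose_options` is my own helper name, and the import is guarded so the snippet degrades gracefully where onnxruntime is absent):

```python
try:
    import onnxruntime as ort
except ImportError:  # onnxruntime not installed in this environment
    ort = None

def make_verbose_options():
    """SessionOptions with severity 0 (VERBOSE) so EP partitioning and
    runner-creation messages are emitted in full."""
    if ort is None:
        return None
    opts = ort.SessionOptions()
    opts.log_severity_level = 0   # 0 = VERBOSE ... 4 = FATAL
    opts.log_verbosity_level = 2  # extra detail within VERBOSE records
    return opts
```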
I can provide:
- full quicktest console output
- PowerShell device enumeration output
- pnputil driver enumeration output
- screenshots of the clean reinstall steps
- full OGA and Stable Diffusion error logs
Attachments available on request.
I can also test another recommended driver/runtime combination if AMD suggests one.