opt: fix conv1d and speedup qwen3next shape#198

Open
Xu-pan wants to merge 3 commits into master from xupan/opt_conv1d

Xu-pan (Contributor) commented Mar 31, 2026

MR description

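This MR optimizes the device latency of TTXCausalConv1dUpdateState for the small shape (1, 1, 32, 4, None). The goal is to improve the execution efficiency of this hot-spot case while keeping the changes as small as possible.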
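For readers unfamiliar with the operation, here is a minimal pure-Python sketch of what a single-step causal conv1d state update generally computes: shift a per-channel history window by one sample, append the new input, and take a dot product with the filter taps. The function name `causal_conv1d_update`, its argument layout, and the list-based representation are assumptions made for illustration. This is not the TTX kernel, and it does not interpret the shape tuple above.

```python
def causal_conv1d_update(state, x, weight, bias=None):
    """One decode step of a depthwise causal conv1d (illustrative sketch).

    state:  list of channels, each a list of the last `width` inputs, oldest first
    x:      list with one new input value per channel
    weight: list of channels, each a list of `width` filter taps
    bias:   optional list with one value per channel
    Returns (outputs per channel, updated state); inputs are not modified.
    """
    new_state, out = [], []
    for c, (hist, taps) in enumerate(zip(state, weight)):
        # Drop the oldest sample and append the newest one.
        window = hist[1:] + [x[c]]
        # Causal output: only current and past samples contribute.
        y = sum(h * w for h, w in zip(window, taps))
        if bias is not None:
            y += bias[c]
        new_state.append(window)
        out.append(y)
    return out, new_state


# Hypothetical single-channel example with a width-4 filter.
out, new_state = causal_conv1d_update(
    state=[[0, 0, 0, 1]], x=[2], weight=[[1, 2, 3, 4]]
)
```

With such small per-step work, latency in a real kernel tends to be dominated by launch and memory-access overhead rather than arithmetic, which is why small shapes are a common tuning target.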

Optimization results

  • TTX device time reduced from 12.3520 us to 4.6200 us
  • Equivalent to a 2.67x speedup, or a 62.6% reduction in device time
  • For the same case, compared with the Torch reference at 48.4896 us, TTX is now about 10.5x faster
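The derived figures above follow directly from the three measured latencies. A small check, using only the numbers quoted in this description (the helper names `speedup` and `reduction_pct` are illustrative):

```python
# Latencies quoted in the MR description, in microseconds.
ttx_before_us = 12.3520
ttx_after_us = 4.6200
torch_ref_us = 48.4896


def speedup(baseline_us, optimized_us):
    """How many times faster the optimized run is than the baseline."""
    return baseline_us / optimized_us


def reduction_pct(baseline_us, optimized_us):
    """Percentage of baseline time removed by the optimization."""
    return (1.0 - optimized_us / baseline_us) * 100.0


print(f"TTX speedup:        {speedup(ttx_before_us, ttx_after_us):.2f}x")
print(f"TTX time reduction: {reduction_pct(ttx_before_us, ttx_after_us):.1f}%")
print(f"TTX vs Torch ref:   {speedup(torch_ref_us, ttx_after_us):.1f}x")
```

Rounded to the precision used above, this gives 2.67x, 62.6%, and 10.5x respectively.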

Xu-pan force-pushed the xupan/opt_conv1d branch from 3863822 to 54600e1 on April 2, 2026 07:35
Xu-pan changed the title from "[wip] opt: fix conv1d and speedup qwen3next shape" to "opt: fix conv1d and speedup qwen3next shape" on April 3, 2026
zhangjihang-BD requested a review from zhanzy178 on April 3, 2026 11:27