MS student @ NTU. Working on multimodal and speech foundation models. Trying to make machines see, hear, and reason together.
- National Taiwan University
- Taipei, Taiwan
- Joined Mar 15, 2026
Popular repositories
- mm-reason-bench (Public, Python): Multimodal reasoning benchmark for vision-language-audio models with cross-modal dependency analysis
- unified-speech-codec (Public, Python): Unified speech codec framework with pluggable quantization (RVQ, BSQ) and semantic supervision
- WAM-Flow (Public, Python; forked from fudan-generative-vision/WAM-Flow): [CVPR 2026] WAM-Flow: Parallel Coarse-to-Fine Motion Planning via Discrete Flow Matching for Autonomous Driving