| Jan 27, 2026 | AVMeme Exam is public: A Multimodal Multilingual Multicultural Benchmark for LLMs’ Contextual and Cultural Knowledge and Thinking |
| Jan 17, 2026 | A paper I mentored, SightSound-R1: Cross-Modal Reasoning Distillation from Vision to Audio Language Models, was accepted to ICASSP 2026 🎉 |
| Jan 13, 2026 | My MSR internship paper, Sci-Phi: A Large Language Model Spatial Audio Descriptor, was accepted to IEEE Open Journal of Signal Processing 🎉 |
| Nov 07, 2025 | DMOSpeech 2: Reinforcement Learning for Duration Prediction in Metric-Optimized Speech Synthesis was accepted to AAAI 2026 🎉 |
| Oct 14, 2025 | Bridging Ears & Eyes, our work on cross-modal audio-visual LLM distillation, won the Best Paper award 🥇 at WASPAA 2025 |