Xilin Jiang

Notre-Dame de la Garde, Marseille, France

I am a PhD candidate at Columbia University in Electrical Engineering and the Zuckerman Mind Brain Behavior Institute. Advised by Prof. Nima Mesgarani in the Neural Acoustic Processing Lab, I study audio generative AI and multimodal LLMs, aiming to model and assist human Listening, Thinking, and Speaking. My work spans Speech-to-Text (audio understanding, speech recognition), Text-to-Speech (speech synthesis), and Speech-to-Speech (dialog systems, speech enhancement) applications.

I obtained my BS degree from UIUC ECE, where I worked with Dr. Efthymios Tzinis and Prof. Paris Smaragdis on speech separation. I will intern at Meta Superintelligence this summer.

[Our lab often has undergraduate and master’s research opportunities for Columbia students. Feel free to reach out.]

Education

Columbia University, New York, NY

PhD candidate after joint MS in Electrical Engineering

Fall 2022 ~ Now💻☕, GPA: 4.12/4.00 (“A+”=4.33)

University of Illinois Urbana–Champaign, IL

BS in Computer Engineering, Highest Honor and Bronze Tablet 🏅🎓

Fall 2018 ~ Fall 2021, GPA: 4.00/4.00

Internship

Microsoft Research, Redmond, WA

Summer 2025, Research Intern

Mentors: Sebastian Braun & Hannes Gamper

Project: Sci-Phi: A Large Language Model Spatial Audio Descriptor

Amazon, Palo Alto, CA

Summer 2021 & 2022, SDE Intern

News

Jan 27, 2026	AVMeme Exam is out: A Multimodal Multilingual Multicultural Benchmark for LLMs’ Contextual and Cultural Knowledge and Thinking
Jan 17, 2026	My mentored paper SightSound-R1: Cross-Modal Reasoning Distillation from Vision to Audio Language Models was accepted to ICASSP 2026 🎉
Jan 13, 2026	My MSR intern paper Sci-Phi: A Large Language Model Spatial Audio Descriptor was accepted to IEEE Open Journal of Signal Processing 🎉
Nov 07, 2025	DMOSpeech 2: Reinforcement Learning for Duration Prediction in Metric-Optimized Speech Synthesis was accepted to AAAI 2026 🎉
Oct 14, 2025	Bridging Ears & Eyes, our work on cross-modal distillation between audio and visual LLMs, won the Best Paper award 🥇 at WASPAA 2025