I am a PhD candidate at the College of Computer Science and Technology, Zhejiang University (浙江大学计算机学院).
I work on the Audio Research Team at Zhejiang University, under the supervision of Prof. Zhou Zhao (赵洲). Previously, I graduated from Chu Kochen Honors College, Zhejiang University (浙江大学竺可桢学院), with dual bachelor’s degrees in Computer Science and Automation. I have also been a visiting scholar at the University of Rochester with Prof. Zhiyao Duan and at the University of Massachusetts Amherst with Prof. Przemyslaw Grabowicz.
My research focuses primarily on multimodal generative AI, specifically spatial audio, music, singing, and speech. I have published first-author papers at top international AI conferences, including NeurIPS, ACL, AAAI, and EMNLP. Currently, I am working on spatial audio generation with multimodal prompts and on streaming voice conversion.
I am actively seeking research collaborations. Please feel free to contact me via email at aaron9834@icloud.com.