Bio: Hao-Wen (Herman) Dong is a PhD Candidate in Computer Science at the University of California San Diego, working with Prof. Julian McAuley and Prof. Taylor Berg-Kirkpatrick. Herman's research aims to empower music and audio creation with machine learning. His long-term goal is to lower the barrier of entry for music composition and democratize audio content creation. He is broadly interested in music generation, audio synthesis, multimodal machine learning, and music information retrieval. He has collaborated with researchers at NVIDIA, Adobe, Dolby, Amazon, Sony, and Yamaha through internships. Prior to his PhD, he was a research assistant at Academia Sinica working with Prof. Yi-Hsuan Yang. Herman's research has been recognized by the ICASSP Rising Stars in Signal Processing program and the UCSD GPSA Interdisciplinary Research Award. His PhD study has been supported by the IEEE SPS Scholarship, the Taiwan Government Scholarship to Study Abroad, the J. Yang Scholarship, and the UCSD ECE Department Fellowship.

Talk Title: Generative AI for Music and Audio

Abstract: Generative AI has been transforming the way we interact with technology and consume content. In this talk, I will briefly introduce the three main directions of my research centered around generative AI for music and audio: 1) multitrack music generation, 2) assistive music creation tools, and 3) multimodal learning for audio and music. I will then zoom in on my recent work on learning text-to-audio synthesis from videos using pretrained language-vision models and diffusion models. Finally, I will close the talk by discussing the challenges and future directions of generative AI for music and audio.
