|
Research
My research interests span a wide range of deep generative models (AR, flow, GAN, diffusion,
etc.) applied to sequential data. Specifically, I am working on building multi-modal large language models
with a focus on audio.
During my Ph.D., I focused on time-domain waveform data (speech and audio) to advance generative modeling for audio.
I am also broadly interested in speech and audio applications, including text-to-speech, voice conversion, music generation, neural audio codecs, and audio language models.
Representative papers are highlighted.
|
|
|
Research Scientist @ NVIDIA
Jan 2024 - Current
In the Applied Deep Learning Research team, I am working on building multi-modal large language models with a focus on audio.
Sep 2021 - Jan 2022
As a research intern, I worked on improving neural vocoders for high-quality speech and audio synthesis, advised by Wei Ping and Boris Ginsburg.
|
|
|
Senior Research Engineer @ Qualcomm AI Research
Feb 2023 - Jan 2024
I developed a framework for Text-to-Speech (TTS) research and development, optimized for deployment on edge devices.
|
|
|
Research Intern @ Microsoft Research Asia
Dec 2020 - May 2021
I worked on diffusion-based generative models for speech synthesis, advised by Xu Tan, Chang Liu, Qi Meng, and Tao Qin.
Dec 2018 - Feb 2019
I worked on the Antigen Map Project, where I applied sequence models to predict antigens from genetic sequences, advised by Bin Shao.
|
|
|
Research Intern @ Kakao Corporation
Jul 2019 - Sep 2019
I worked on improving speech synthesis and voice conversion models, advised by Jaehyeon Kim and Jaekyong Bae.
|
|
|
Ph.D. at Seoul National University
Electrical and Computer Engineering
Sep 2016 - Feb 2023
Dissertation: Deep Generative Model for Waveform Synthesis
Integrated M.S./Ph.D. Program. Advisor: Sungroh Yoon.
Dual B.S. at Seoul National University
Electrical and Computer Engineering / Applied Biology and Chemistry
Mar 2010 - Aug 2016
Cum Laude
|
|
Projects
During my time at DSAIL, I collaborated with Seoul National University Hospital on a computer-aided diagnosis project for liver cancer.
The project yielded a high-performance medical object detection model that helps radiologists reduce human error in the early detection of liver disease.
|
|
Invited Talks, Honors, and Awards
|
- Invited Talk "Deep Generative Model for Speech and Audio", Soongsil University, 2023
- Invited Talk "Towards Universal Neural Waveform Synthesis", Naver, 2022
- Invited Talk "On Neural Waveform Synthesis", Supertone, 2022
- Invited Talk "Prior Enhancement for Deep Generative Models", Hyundai AIRS, 2022
- Student Conference Scholarship, Google, 2022
- Invited Talk "Neural Speech Synthesis: a 2021 Landscape", NVIDIA, 2021
- Graduate Student of the Year, DSAIL, Seoul National University, 2019
- Best Paper Award, Hyundai AIR Lab (currently AIRS), 2019
- Stars of Tomorrow (Excellent Intern), Microsoft Research Asia, 2019
- Invited Talk "RNN Plus Alpha: Is RNN the False Prophet?", Naver CLOVA, 2018
- Cum Laude, Seoul National University, 2016
- Academic Performance Scholarship, Seoul National University, 2010 - 2016
- Academic Scholarship (fully funded), SBS Foundation, 2010 - 2016
 |
I am a PC hardware enthusiast, always eager to learn about computers in my free time.
As a hobbyist DJ, I enjoy house music. My mixes are on YouTube.
|
Last update: Jan 2026.
|
|