Helin Wang

Whiting School of Engineering,
Johns Hopkins University, Baltimore

Email: hwang258@jhu.edu

Phone: +1 667-354-9059

General

I am a research scientist at ByteDance Seed. I received my PhD from Johns Hopkins University in 2026. My research focuses mainly on AI for speech and audio signal processing.

News

March, 2026: I defended my PhD successfully!

December, 2025: SAM Audio launched!

May, 2025: I reached 1,000 citations on Google Scholar!

Dec, 2024: Our TASLP paper "Diffsound: Discrete Diffusion Model for Text-to-Sound Generation" has been selected for the 2024 IEEE SPS Young Author Best Paper Award!

Sep, 2024: Our INTERSPEECH paper "Noise-robust Speech Separation with Fast Generative Correction" has been nominated for the Best Student Paper Award and the Best Paper Award (5 out of 1,030 accepted papers)!

May, 2022: I received the Outstanding Graduate Student & Thesis Award of Peking University!

April, 2022: I reached 100 citations on Google Scholar!

Research Interests

Machine Learning, Audio Processing, Speech Processing, Artificial Intelligence

Selected Works

Bowen Shi*, Andros Tjandra*, John Hoffman*, Helin Wang*, Yi-Chiao Wu*, Luya Gao*, Julius Richter, Matt Le, Apoorv Vyas, Sanyuan Chen, Christoph Feichtenhofer, Piotr Dollár, Wei-Ning Hsu, Ann Lee
SAM Audio: Segment Anything in Audio
Preprint, 2025. [Code]

Helin Wang*, Jiarui Hai*, Dading Chong, Karan Thakkar, Tiantian Feng, Dongchao Yang, Junhyeok Lee, Thomas Thebaud, Laureano Moro Velazquez, Jesus Villalba, Zengyi Qin, Shrikanth Narayanan, Mounya Elhilali, Najim Dehak
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
Preprint, 2025. [Code]

Helin Wang, Jiarui Hai, Dongchao Yang, Chen Chen, Kai Li, Junyi Peng, Thomas Thebaud, Laureano Moro Velazquez, Jesus Villalba, Najim Dehak
SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline
Preprint, 2025. [Code]

Helin Wang, Meng Yu, Jiarui Hai, Chen Chen, Yuchen Hu, Rilin Chen, Najim Dehak, Dong Yu
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis
ICASSP, 2025. [Code]

Helin Wang*, Jiarui Hai*, Yen-Ju Lu, Karan Thakkar, Mounya Elhilali, Najim Dehak
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer
ICASSP, 2025. [Code]

Education

Department of Electrical and Computer Engineering, Johns Hopkins University

August 2022 - March 2026

School of Electronic and Computer Engineering, Peking University

August 2019 - July 2022

Department of Automation, Tsinghua University

August 2015 - July 2019

Experience

May 2025 - December 2025
Meta FAIR, AudioBox Team, New York, USA, Research Scientist Intern.
Supervisors: Wei-Ning Hsu and Bowen Shi

August 2022 - March 2026
Johns Hopkins University, Center for Language and Speech Processing (CLSP), Baltimore, USA, Research Assistant.
Supervisors: Najim Dehak, Laureano Moro-Velázquez and Jesús Villalba

May 2024 - August 2024
Tencent AI Lab, Speech Group, Bellevue, USA, Intern.
Supervisors: Meng Yu and Dong Yu

December 2022 - December 2023
Amazon General Intelligence (AGI), Speech Team, Baltimore, USA, Student Researcher.
Supervisors: Venkatesh Ravichandran and Milind Rao

February 2022 - May 2022
Microsoft STCA, NLP Group, Beijing, China, Intern.
Supervisors: Linjun Shou and Ming Gong

May 2020 - November 2021
Tencent AI Lab, Speech Group, Shenzhen, China, Intern.
Supervisors: Bo Wu and Chao Weng

August 2019 - July 2022
Peking University, ADSP Lab, Shenzhen, China, Master's Student.
Supervisor: Yuexian Zou
Co-author: Wenwu Wang

July 2019 - September 2019
Ubtech Robotics Inc., Speech Group, Shenzhen, China, Intern.
Supervisor: Dongyan Huang

July 2018 - September 2018
University of California, Berkeley, California Path Lab, Berkeley, USA, Summer Research.
Supervisor: Masayoshi Tomizuka

Teaching

2026/01 - 2026/03, Teaching Assistant, Johns Hopkins University, Baltimore, USA:
EN.520.661: AI for Biometric Systems: Techniques, Applications and Ethics in Spring 2026 at the Department of Electrical and Computer Engineering

2025/02 - 2025/05, Teaching Assistant, Johns Hopkins University, Baltimore, USA:
EN.520.439/659: Machine Learning for Medical Applications in Spring 2025 at the Department of Electrical and Computer Engineering

2024/02 - 2024/05, Teaching Assistant, Johns Hopkins University, Baltimore, USA:
EN.520.123: Computational Modeling for Electrical and Computer Engineering in Spring 2024 at the Department of Electrical and Computer Engineering