 
 
Whiting School of Engineering,
 Johns Hopkins University, Baltimore
Email: hwang258@jhu.edu
Phone: +1 667-354-9059
I am a PhD candidate at Johns Hopkins University and expect to graduate in 2026. I received my Bachelor's degree from Tsinghua University in 2019 and my Master's degree from Peking University in 2022. My research focuses on AI for speech and audio signal processing, spanning audio generation tasks such as source separation and text-to-speech synthesis, as well as audio understanding tasks such as audio classification and captioning. I have also conducted research or held internships at leading companies including Meta, Microsoft, Tencent, Amazon, and Zoom.
(Note: I am actively looking for any collaboration opportunities. Please feel free to contact me.)
May, 2025: I have reached 1,000 citations on Google Scholar!
Dec, 2024: Our TASLP paper "Diffsound: Discrete Diffusion Model for Text-to-Sound Generation" has been selected for the 2024 IEEE SPS Young Author Best Paper Award!
Sep, 2024: Our INTERSPEECH paper "Noise-robust Speech Separation with Fast Generative Correction" has been nominated for the Best Student Paper Award and the Best Paper Award (5 out of 1,030 accepted papers)!
May, 2022: I received the Outstanding Graduate Student & Thesis Award from Peking University!
April, 2022: I reached 100 citations on Google Scholar!
Machine Learning, Audio Processing, Speech Processing, Artificial Intelligence
May 2025 - Now
      Meta FAIR,  AudioBox Team, New York, USA, Research Scientist Intern. 
 Supervisors:  Bowen Shi and  Wei-Ning Hsu
   
August 2022 - Now
      Johns Hopkins University,  Center for Language and Speech Processing (CLSP), Baltimore, USA, Research Assistant. 
 Supervisors:  Najim Dehak,   Laureano Moro-Velázquez and  Jesús Villalba
   
May 2024 - August 2024
      Tencent AI Lab,  Speech Group, Bellevue, USA, Intern. 
 Supervisors:  Meng Yu  and  Dong Yu 
   
December 2022 - December 2023
      Amazon General Intelligence (AGI),  Speech Team, Baltimore, USA, Student Researcher. 
 Supervisors:  Venkatesh Ravichandran  and  Milind Rao 
   
February 2022 - May 2022
      Microsoft STCA,  NLP Group, Beijing, China, Intern. 
 Supervisors:  Linjun Shou  and  Ming Gong 
   
May 2020 - November 2021
      Tencent AI Lab,  Speech Group, Shenzhen, China, Intern. 
 Supervisors:  Bo Wu  and  Chao Weng
   
August 2019 - July 2022
      Peking University,  ADSP Lab, Shenzhen, China, Master's Student. 
 Supervisor:  Yuexian Zou 
 Co-author:  Wenwu Wang
   
July 2019 - September 2019
      Ubtech Robotics Inc., Speech Group, Shenzhen, China, Intern. 
 Supervisor:  Dongyan Huang
   
July 2018 - September 2018
      University of California Berkeley,  California Path Lab, Berkeley, USA, Summer Research. 
 Supervisor:  Masayoshi Tomizuka
   
Helin Wang, Meng Yu, Jiarui Hai, Chen Chen, Yuchen Hu, Rilin Chen, Najim Dehak, Dong Yu
     SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis 
      ICASSP, 2025.
    [Code]
   
Helin Wang*, Jiarui Hai*, Yen-Ju Lu, Karan Thakkar, Mounya Elhilali, Najim Dehak
     SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer 
      ICASSP, 2025.
    [Code]
   
Helin Wang, Jesus Villalba, Laureano Moro-Velazquez, Jiarui Hai, Thomas Thebaud, Najim Dehak
     Noise-robust Speech Separation with Fast Generative Correction 
    Interspeech, 2024.
    [Code]
   
Jiarui Hai*, Helin Wang*, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali
     DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction 
    ICASSP, 2024.
    [Code]
   
Helin Wang, Venkatesh Ravichandran, Milind Rao, Becky Lammers, Myra Sydnor, Nicholas Maragakis, Ankur A. Butala, Jayne Zhang, Lora Clawson, Victoria Chovaz, Laureano Moro-Velazquez
     Improving fairness for spoken language understanding in atypical speech with Text-to-Speech 
    NeurIPS Workshop on Synthetic Data Generation with Generative AI, 2023.
    [Code]
   
Helin Wang, Thomas Thebaud, Jesus Villalba, Myra Sydnor, Becky Lammers, Najim Dehak, Laureano Moro-Velazquez
     DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion Probabilistic Model 
    Interspeech, 2023.
    [Code]
   
Dading Chong*, Helin Wang*, Peilin Zhou, Qingcheng Zeng
     Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training 
    ICASSP, 2023.
    [Code]
   
Helin Wang, Yuexian Zou, Wenwu Wang 
     SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification 
    Interspeech, 2021.
    [Code]
   
        2025/02 - 2025/05, Teaching Assistant, Johns Hopkins University, Baltimore, USA:
        EN.520.439/659: Machine Learning for Medical Applications in Spring 2025 at the Department of Electrical and Computer Engineering
   
        2024/02 - 2024/05, Teaching Assistant, Johns Hopkins University, Baltimore, USA:
        EN.520.123: Computational Modeling for Electrical and Computer Engineering in Spring 2024 at the Department of Electrical and Computer Engineering