For voice communication, it is important to suppress background noise without introducing unnatural distortion. Deep learning-based speech enhancement approaches can effectively suppress background noise components.
However, under noise-mismatched conditions, where the noise encountered in use differs from the noise seen during training, these approaches generate unnatural residual noise that heavily degrades listening comfort.
Recently, researchers from the Institute of Acoustics of the Chinese Academy of Sciences (IACAS) proposed a supervised speech enhancement approach with residual noise control for voice communication.
By deliberately retaining a low level of residual noise, the researchers sought to jointly maximize noise reduction and minimize speech distortion, which makes the enhanced speech more comfortable to listen to.
To address common shortcomings of existing loss functions, the researchers introduced multiple adjustable hyper-parameters and derived a generalized loss function.
With suitable parameter configurations, the enhanced speech can be balanced flexibly and effectively between the two objectives. Retaining low-level background noise also improved subjective perceptual quality.
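The trade-off described above can be illustrated with a minimal sketch. This is not the loss function derived in the study. It is a simplified stand-in, assuming a per-bin gain applied to magnitude spectra, and the names `alpha` (weight between the two objectives), `beta` (target residual-noise gain) and `p` (exponent) are hypothetical parameters chosen only for illustration.

```python
import numpy as np

def tradeoff_loss(gain, speech_mag, noise_mag, alpha=0.5, beta=0.1, p=2.0):
    """Illustrative loss only; NOT the formula from the published study.

    gain       : per-bin gain applied to the noisy spectrum
    speech_mag : clean speech magnitudes
    noise_mag  : noise magnitudes
    alpha      : hypothetical weight, 1.0 = only speech distortion matters
    beta       : hypothetical target gain for the residual noise floor
    p          : hypothetical exponent on the error terms
    """
    # Speech distortion: how far the filtered speech is from the clean speech.
    distortion = np.mean(np.abs((gain - 1.0) * speech_mag) ** p)
    # Residual noise error: how far the filtered noise is from the
    # deliberately retained low-level floor beta * noise.
    residual = np.mean(np.abs((gain - beta) * noise_mag) ** p)
    return alpha * distortion + (1.0 - alpha) * residual

def optimal_gain(speech_mag, noise_mag, alpha=0.5, beta=0.1, eps=1e-12):
    """Per-bin minimizer of tradeoff_loss for p = 2 (closed form)."""
    s2 = alpha * speech_mag ** 2
    n2 = (1.0 - alpha) * noise_mag ** 2
    return (s2 + beta * n2) / (s2 + n2 + eps)

if __name__ == "__main__":
    speech = np.array([1.0, 0.5, 0.0, 0.0])
    noise = np.array([0.1, 0.1, 0.3, 0.2])
    g = optimal_gain(speech, noise)
    print("gain:", g)
    print("loss:", tradeoff_loss(g, speech, noise))
```

In this sketch, bins that contain only noise are attenuated to roughly `beta` rather than to zero, so a low, steady noise floor remains, while bins dominated by speech keep a gain close to one. Raising `alpha` favors low speech distortion, and lowering it favors stronger noise reduction.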
Experimental results showed that, with suitable parameter configurations, the enhanced speech outperformed previous work on both objective metrics and subjective evaluations.
This work could be used for noise suppression and speech information extraction in speech communication devices.
The study, published in Applied Sciences, was supported by the National Natural Science Foundation of China.