TY - JOUR
T1 - Adversarial multi-task learning with inverse mapping for speech enhancement
AU - Qiu, Yuanhang
AU - Wang, Ruili
AU - Hou, Feng
AU - Singh, Satwinder
AU - Ma, Zhizhong
AU - Jia, Xiaoyun
N1 - Publisher Copyright: © 2022 Elsevier B.V.
PY - 2022/5
Y1 - 2022/5
AB - Adversarial Multi-Task Learning (AMTL) has shown promising capability for information capturing and representation learning, but it has hardly been explored in speech enhancement. In this paper, we propose a novel adversarial multi-task learning method with inverse mapping for speech enhancement. Our method focuses on enhancing the generator's capability for capturing speech information and learning representations. To implement this method, two extra networks (namely P and Q) are developed to establish the inverse mapping from the generated distribution to the input data domains. Correspondingly, two new loss functions (i.e., latent loss and equilibrium loss) are proposed for inverse mapping learning and, together with the original adversarial loss, for training the enhancement model. Our method obtains state-of-the-art performance in terms of speech quality (PESQ=2.93, COVL=3.55). For speech intelligibility, our method also obtains competitive performance (STOI=0.947). The experimental results demonstrate that our method can effectively improve speech representation learning and speech enhancement performance.
KW - Adversarial multi-task learning
KW - Deep neural networks
KW - Inverse mapping learning
KW - Speech enhancement
UR - http://www.scopus.com/inward/record.url?scp=85126010784&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2022.108568
DO - 10.1016/j.asoc.2022.108568
M3 - Article
AN - SCOPUS:85126010784
SN - 1568-4946
VL - 120
JO - Applied Soft Computing Journal
JF - Applied Soft Computing Journal
M1 - 108568
ER -