Adversarial multi-task learning with inverse mapping for speech enhancement

Yuanhang Qiu; Ruili Wang; Feng Hou; Satwinder Singh; Zhizhong Ma; Xiaoyun Jia

doi:10.1016/j.asoc.2022.108568

Research output: Journal Publication › Article › peer-review

13 Citations (Scopus)

Abstract

Adversarial Multi-Task Learning (AMTL) has shown a promising capability for information capturing and representation learning, but it has hardly been explored in speech enhancement. In this paper, we propose a novel method for speech enhancement: adversarial multi-task learning with inverse mapping. The method focuses on strengthening the generator's capability for capturing speech information and learning representations. To implement it, two extra networks (P and Q) are developed to establish the inverse mapping from the generated distribution to the input data domains. Correspondingly, two new loss functions (latent loss and equilibrium loss) are proposed for inverse mapping learning and for training the enhancement model alongside the original adversarial loss. Our method obtains state-of-the-art performance in terms of speech quality (PESQ=2.93, COVL=3.55) and competitive performance in terms of speech intelligibility (STOI=0.947). The experimental results demonstrate that our method can effectively improve speech representation learning and speech enhancement performance.

Original language	English
Article number	108568
Journal	Applied Soft Computing Journal
Volume	120
DOIs	https://doi.org/10.1016/j.asoc.2022.108568
Publication status	Published - May 2022
Externally published	Yes

Keywords

Adversarial multi-task learning
Deep neural networks
Inverse mapping learning
Speech enhancement

ASJC Scopus subject areas

Software

Cite this

@article{f1c7deaa520940d2bb9f63fe1f26fcd3,

title = "Adversarial multi-task learning with inverse mapping for speech enhancement",

abstract = "Adversarial Multi-Task Learning (AMTL) has shown a promising capability for information capturing and representation learning, but it has hardly been explored in speech enhancement. In this paper, we propose a novel method for speech enhancement: adversarial multi-task learning with inverse mapping. The method focuses on strengthening the generator's capability for capturing speech information and learning representations. To implement it, two extra networks (P and Q) are developed to establish the inverse mapping from the generated distribution to the input data domains. Correspondingly, two new loss functions (latent loss and equilibrium loss) are proposed for inverse mapping learning and for training the enhancement model alongside the original adversarial loss. Our method obtains state-of-the-art performance in terms of speech quality (PESQ=2.93, COVL=3.55) and competitive performance in terms of speech intelligibility (STOI=0.947). The experimental results demonstrate that our method can effectively improve speech representation learning and speech enhancement performance.",

keywords = "Adversarial multi-task learning, Deep neural networks, Inverse mapping learning, Speech enhancement",

author = "Yuanhang Qiu and Ruili Wang and Feng Hou and Satwinder Singh and Zhizhong Ma and Xiaoyun Jia",

note = "Publisher Copyright: {\textcopyright} 2022 Elsevier B.V.",

year = "2022",

month = may,

doi = "10.1016/j.asoc.2022.108568",

language = "English",

volume = "120",

journal = "Applied Soft Computing Journal",

issn = "1568-4946",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - Adversarial multi-task learning with inverse mapping for speech enhancement

AU - Qiu, Yuanhang

AU - Wang, Ruili

AU - Hou, Feng

AU - Singh, Satwinder

AU - Ma, Zhizhong

AU - Jia, Xiaoyun

PY - 2022/5

Y1 - 2022/5

N2 - Adversarial Multi-Task Learning (AMTL) has shown a promising capability for information capturing and representation learning, but it has hardly been explored in speech enhancement. In this paper, we propose a novel method for speech enhancement: adversarial multi-task learning with inverse mapping. The method focuses on strengthening the generator's capability for capturing speech information and learning representations. To implement it, two extra networks (P and Q) are developed to establish the inverse mapping from the generated distribution to the input data domains. Correspondingly, two new loss functions (latent loss and equilibrium loss) are proposed for inverse mapping learning and for training the enhancement model alongside the original adversarial loss. Our method obtains state-of-the-art performance in terms of speech quality (PESQ=2.93, COVL=3.55) and competitive performance in terms of speech intelligibility (STOI=0.947). The experimental results demonstrate that our method can effectively improve speech representation learning and speech enhancement performance.

AB - Adversarial Multi-Task Learning (AMTL) has shown a promising capability for information capturing and representation learning, but it has hardly been explored in speech enhancement. In this paper, we propose a novel method for speech enhancement: adversarial multi-task learning with inverse mapping. The method focuses on strengthening the generator's capability for capturing speech information and learning representations. To implement it, two extra networks (P and Q) are developed to establish the inverse mapping from the generated distribution to the input data domains. Correspondingly, two new loss functions (latent loss and equilibrium loss) are proposed for inverse mapping learning and for training the enhancement model alongside the original adversarial loss. Our method obtains state-of-the-art performance in terms of speech quality (PESQ=2.93, COVL=3.55) and competitive performance in terms of speech intelligibility (STOI=0.947). The experimental results demonstrate that our method can effectively improve speech representation learning and speech enhancement performance.

KW - Adversarial multi-task learning

KW - Deep neural networks

KW - Inverse mapping learning

KW - Speech enhancement

UR - http://www.scopus.com/inward/record.url?scp=85126010784&partnerID=8YFLogxK

U2 - 10.1016/j.asoc.2022.108568

DO - 10.1016/j.asoc.2022.108568

M3 - Article

AN - SCOPUS:85126010784

SN - 1568-4946

VL - 120

JO - Applied Soft Computing Journal

JF - Applied Soft Computing Journal

M1 - 108568

ER -
