A Novel Self-training Approach for Low-resource Speech Recognition

Satwinder Singh; Feng Hou; Ruili Wang

doi:10.21437/Interspeech.2023-540

Research output: Journal Publication › Conference article › peer-review

7 Citations (Scopus)

Abstract

In this paper, we propose a self-training approach for automatic speech recognition (ASR) for low-resource settings. While self-training approaches have been extensively developed and evaluated for high-resource languages such as English, their applications to low-resource languages like Punjabi have been limited, despite the language being spoken by millions globally. The scarcity of annotated data has hindered the development of accurate ASR systems, especially for low-resource languages (e.g., Punjabi and Māori languages). To address this issue, we propose an effective self-training approach that generates highly accurate pseudo-labels for unlabeled low-resource speech. Our experimental analysis demonstrates that our approach significantly improves word error rate, achieving a relative improvement of 14.94% compared to a baseline model across four real speech datasets. Further, our proposed approach reports the best results on the Common Voice Punjabi dataset.
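The abstract reports its gain as a relative improvement in word error rate (WER). The sketch below shows how a relative WER reduction is conventionally computed from a baseline WER and a new WER. The function name `relative_wer_improvement` and the input values are illustrative assumptions, not figures from the paper, and the abstract does not say how the 14.94% figure was aggregated across the four datasets.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Computed as 100 * (baseline - new) / baseline, the conventional
    definition of a relative improvement over a baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values for illustration only; not results from the paper.
print(round(relative_wer_improvement(20.0, 17.0), 2))
```

Under this definition, a drop from 20.0 to 17.0 WER is a 3-point absolute reduction but a 15% relative one, which is why relative figures are usually stated alongside the baseline they are measured against.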

Original language	English
Pages (from-to)	1588-1592
Number of pages	5
Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume	2023-August
DOIs	https://doi.org/10.21437/Interspeech.2023-540
Publication status	Published - 2023
Externally published	Yes
Event	24th International Speech Communication Association, Interspeech 2023 - Dublin, Ireland
Duration	20 Aug 2023 → 24 Aug 2023

Keywords

low-resource
Punjabi ASR
self-training

ASJC Scopus subject areas

Language and Linguistics
Human-Computer Interaction
Signal Processing
Software
Modelling and Simulation

Cite this

@article{328340ff227947ef832f85a01235fd8c,
  title = "A Novel Self-training Approach for Low-resource Speech Recognition",
  abstract = "In this paper, we propose a self-training approach for automatic speech recognition (ASR) for low-resource settings. While self-training approaches have been extensively developed and evaluated for high-resource languages such as English, their applications to low-resource languages like Punjabi have been limited, despite the language being spoken by millions globally. The scarcity of annotated data has hindered the development of accurate ASR systems, especially for low-resource languages (e.g., Punjabi and Māori languages). To address this issue, we propose an effective self-training approach that generates highly accurate pseudo-labels for unlabeled low-resource speech. Our experimental analysis demonstrates that our approach significantly improves word error rate, achieving a relative improvement of 14.94\% compared to a baseline model across four real speech datasets. Further, our proposed approach reports the best results on the Common Voice Punjabi dataset.",
  keywords = "low-resource, Punjabi ASR, self-training",
  author = "Satwinder Singh and Feng Hou and Ruili Wang",
  note = "Publisher Copyright: {\textcopyright} 2023 International Speech Communication Association. All rights reserved.; 24th International Speech Communication Association, Interspeech 2023 ; Conference date: 20-08-2023 Through 24-08-2023",
  year = "2023",
  doi = "10.21437/Interspeech.2023-540",
  language = "English",
  volume = "2023-August",
  pages = "1588--1592",
  journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  issn = "2308-457X",
}

TY  - JOUR
T1  - A Novel Self-training Approach for Low-resource Speech Recognition
AU  - Singh, Satwinder
AU  - Hou, Feng
AU  - Wang, Ruili
PY  - 2023
Y1  - 2023
N2  - In this paper, we propose a self-training approach for automatic speech recognition (ASR) for low-resource settings. While self-training approaches have been extensively developed and evaluated for high-resource languages such as English, their applications to low-resource languages like Punjabi have been limited, despite the language being spoken by millions globally. The scarcity of annotated data has hindered the development of accurate ASR systems, especially for low-resource languages (e.g., Punjabi and Māori languages). To address this issue, we propose an effective self-training approach that generates highly accurate pseudo-labels for unlabeled low-resource speech. Our experimental analysis demonstrates that our approach significantly improves word error rate, achieving a relative improvement of 14.94% compared to a baseline model across four real speech datasets. Further, our proposed approach reports the best results on the Common Voice Punjabi dataset.
AB  - In this paper, we propose a self-training approach for automatic speech recognition (ASR) for low-resource settings. While self-training approaches have been extensively developed and evaluated for high-resource languages such as English, their applications to low-resource languages like Punjabi have been limited, despite the language being spoken by millions globally. The scarcity of annotated data has hindered the development of accurate ASR systems, especially for low-resource languages (e.g., Punjabi and Māori languages). To address this issue, we propose an effective self-training approach that generates highly accurate pseudo-labels for unlabeled low-resource speech. Our experimental analysis demonstrates that our approach significantly improves word error rate, achieving a relative improvement of 14.94% compared to a baseline model across four real speech datasets. Further, our proposed approach reports the best results on the Common Voice Punjabi dataset.
KW  - low-resource
KW  - Punjabi ASR
KW  - self-training
UR  - http://www.scopus.com/inward/record.url?scp=85171531203&partnerID=8YFLogxK
U2  - 10.21437/Interspeech.2023-540
DO  - 10.21437/Interspeech.2023-540
M3  - Conference article
AN  - SCOPUS:85171531203
SN  - 2308-457X
VL  - 2023-August
SP  - 1588
EP  - 1592
JO  - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF  - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2  - 24th International Speech Communication Association, Interspeech 2023
Y2  - 20 August 2023 through 24 August 2023
ER  - 
