@inproceedings{7e84960b96f84faabe0dd14cfcc83891,
  title     = {A Novel Two-step Fine-tuning Framework for Transfer Learning in Low-Resource Neural Machine Translation},
  abstract  = {Existing transfer learning methods for neural machine translation typically use a well-trained translation model (i.e., a parent model) of a high-resource language pair to directly initialize a translation model (i.e., a child model) of a low-resource language pair, and the child model is then fine-tuned with corresponding datasets. In this paper, we propose a novel two-step fine-tuning (TSFT) framework for transfer learning in low-resource neural machine translation. In the first step, we adjust the parameters of the parent model to fit the child language by using the child source data. In the second step, we transfer the adjusted parameters to the child model and fine-tune it with a proposed distillation loss for efficient optimization. Our experimental results on five low-resource translations demonstrate that our framework yields significant improvements over various strong transfer learning baselines. Further analysis demonstrated the effectiveness of different components in our framework.},
  author    = {Gao, Yuan and Hou, Feng and Wang, Ruili},
  editor    = {Duh, Kevin and Gomez, Helena and Bethard, Steven},
  booktitle = {Findings of the Association for Computational Linguistics: {NAACL} 2024},
  publisher = {Association for Computational Linguistics (ACL)},
  year      = {2024},
  month     = jun,
  pages     = {3214--3224},
  address   = {United States},
  language  = {English},
  note      = {Publisher Copyright: {\textcopyright} 2024 Association for Computational Linguistics.; 2024 Findings of the Association for Computational Linguistics: NAACL 2024 ; Conference date: 16-06-2024 Through 21-06-2024},
}