Free-FreeSLT: A Gloss-Free Parameter-Free model for Sign Language Translation

Weirong Sun, Yujun Ma, Ruili Wang

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

Sign language translation (SLT) is a demanding task that integrates visual and linguistic information and requires cross-modal learning to translate visual motions into text. Current gloss-based methods rely on gloss annotations for translation. Because annotated sign language video data are in limited supply, these methods depend on labor-intensive, high-quality annotation of sign language videos. To tackle this issue, we introduce a novel two-stage gloss-free sign language translation model with a parameter-free visual-language pre-training method that enhances visual and semantic representations without introducing extra parameters. The proposed two-stage model works as follows: (i) in the pre-training stage, Contrastive Language-Image Pre-training (CLIP) is adopted to align visual and textual features, which are then aggregated using a mean pooling mechanism; (ii) in the fine-tuning stage, parameters from the pre-trained model are inherited to enhance sign language translation. Our proposed model surpasses the leading gloss-free SLT model on PHOENIX-2014T across various n-gram levels of the BLEU score.
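
The pre-training idea in the abstract (mean pooling plus CLIP-style contrastive alignment) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the symmetric cross-entropy form of the loss, the temperature of 0.07, and the choice to pool before computing similarities are all assumptions made for illustration. The only point it is meant to show is that mean pooling aggregates a sequence of features without any learnable weights.

```python
import math

def mean_pool(sequence):
    """Average a list of feature vectors into one vector (no learnable weights)."""
    length = len(sequence)
    return [sum(column) / length for column in zip(*sequence)]

def l2_normalize(vector, eps=1e-8):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in vector)) + eps
    return [x / norm for x in vector]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_style_loss(video_batch, text_batch, temperature=0.07):
    """Symmetric contrastive loss over paired (video, text) feature sequences.

    Assumption: each video is a list of frame vectors and each text a list of
    token vectors, already projected to the same dimensionality.
    """
    videos = [l2_normalize(mean_pool(v)) for v in video_batch]
    texts = [l2_normalize(mean_pool(t)) for t in text_batch]
    # Cosine similarity between every video and every text, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(v, t)) / temperature for t in texts]
              for v in videos]
    transposed = [list(column) for column in zip(*logits)]
    video_to_text = cross_entropy_diagonal(logits)
    text_to_video = cross_entropy_diagonal(transposed)
    return 0.5 * (video_to_text + text_to_video)

# Toy batch (hypothetical data): clip i and sentence i share the same direction.
basis = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
video_batch = [[basis[i]] * 4 for i in range(3)]  # 4 identical frames per clip
text_batch = [[basis[i]] * 2 for i in range(3)]   # 2 identical tokens per sentence

aligned = clip_style_loss(video_batch, text_batch)
mismatched = clip_style_loss(video_batch, text_batch[1:] + text_batch[:1])
print(aligned < mismatched)  # prints True
```

In this toy setting the loss is near zero when each clip is paired with its own sentence and large when the sentences are rotated by one position, which is the behavior a contrastive alignment objective is expected to have.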

Original language: English
Title of host publication: Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops, MMAsia 2024 Workshops
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400713149
Publication status: Published - 26 Dec 2024
Externally published: Yes
Event: 6th ACM International Conference on Multimedia in Asia Workshops, MMAsia 2024 Workshops - Auckland, New Zealand
Duration: 3 Dec 2024 to 6 Dec 2024

Publication series

Name: Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops, MMAsia 2024 Workshops

Conference

Conference: 6th ACM International Conference on Multimedia in Asia Workshops, MMAsia 2024 Workshops
Country/Territory: New Zealand
City: Auckland
Period: 3/12/24 to 6/12/24

Keywords

  • Contrastive Language-Image Pre-training (CLIP)
  • Gloss-free
  • Sign Language Translation

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
