LViTE: A Lightweight Vision Transformer with Ensemble Classification for Sign Language Recognition

Edmond Li Ren Ewe, Chin Poo Lee, Kian Ming Lim, Lee Chung Kwek, Heng Siong Lim

Research output: Chapter in Book/Conference proceeding, Conference contribution, peer-review

Abstract

Sign language recognition is essential for human-machine interaction, supporting communication for individuals with hearing and speech impairments. However, challenges remain due to variability in hand shapes, orientations, motion dynamics, and environmental factors such as lighting and occlusion. Moreover, many existing models are computationally intensive, limiting their applicability in resource-constrained settings. This paper introduces the Lightweight Vision Transformer with Ensemble Classification (LViTE), a streamlined framework that balances accuracy and efficiency. LViTE employs a reduced Vision Transformer backbone with fewer encoder layers and attention heads to lower computational cost, while an ensemble-based classification mechanism enhances robustness through aggregated predictions from multiple decision trees. Evaluated on three benchmark datasets - American Sign Language (ASL), ASL with Digits, and NUS Hand Posture - LViTE achieves state-of-the-art accuracies of 99.98%, 99.98%, and 99.97%, respectively. These results demonstrate LViTE's effectiveness and suitability for real-time deployment in human-machine systems where both performance and efficiency are critical.
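
The aggregation step mentioned in the abstract, where predictions from multiple decision trees are combined, can be illustrated with a minimal sketch. The abstract does not say which ensemble method or voting rule LViTE uses, so plain majority voting over hand-written depth-1 trees is an assumption here, as are the names `make_stump` and `ensemble_predict` and the toy three-dimensional feature vector standing in for the backbone's output.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A "tree" is any callable mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], str]


def make_stump(feature_index: int, threshold: float, below: str, above: str) -> Tree:
    """Build a depth-1 decision tree (a stump) that tests a single feature."""
    def predict(features: Sequence[float]) -> str:
        return below if features[feature_index] <= threshold else above
    return predict


def ensemble_predict(trees: List[Tree], features: Sequence[float]) -> str:
    """Aggregate the per-tree predictions by majority vote (assumed rule)."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stumps; a real ensemble would be learned from training features.
trees = [
    make_stump(0, 0.5, "A", "B"),
    make_stump(1, 0.2, "A", "B"),
    make_stump(2, 0.9, "B", "A"),
]

# Toy feature vector standing in for the Vision Transformer backbone output.
features = [0.3, 0.7, 0.4]
label = ensemble_predict(trees, features)
```

In this toy case the three stumps vote "A", "B", and "B", so the ensemble returns "B" even though one tree disagrees, which is the kind of robustness aggregation is meant to provide.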

Original language: English
Title of host publication: 2025 International Conference on Information and Communication Technology, ICoICT 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331503239
Publication status: Published - 2025
Event: 2025 International Conference on Information and Communication Technology, ICoICT 2025 - Hybrid, Bandung, Indonesia
Duration: 30 Jul 2025 to 31 Jul 2025

Publication series

Name: 2025 International Conference on Information and Communication Technology, ICoICT 2025

Conference

Conference: 2025 International Conference on Information and Communication Technology, ICoICT 2025
Country/Territory: Indonesia
City: Hybrid, Bandung
Period: 30/07/25 to 31/07/25

Keywords

  • deep learning
  • human-machine
  • machine learning
  • Sign language recognition
  • vision transformer

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
