Wi-ViTAL: Domain Generalization of Wireless Human Activity Recognition Using Linear Attention Vision Transformer With Adversarial Learning

    Research output: Journal Publication, Article, peer-review

    Abstract

    Learning-based, passive, device-free wireless human activity recognition (WHAR) systems still face significant challenges, especially in real-world deployments. Environmental differences and domain diversity cause signals collected in the source domain to follow a different distribution from those in the target domain, which degrades recognition accuracy. To achieve domain generalization (DG), a multi-scale linear attention vision transformer (ViT) based feature extractor and domain adversarial learning with Wasserstein distance are proposed. By aligning both the marginal and conditional distributions across different source domains, the adversarial learning reduces the discrepancy between the training domains and unseen domains. As a result, the extracted features become domain-invariant in the latent space, so that accuracy is preserved in new or unseen domains. Extensive evaluations using commercial IEEE 802.11ac routers, with human activity data collected over different days, environments, human subjects, and obstacle configurations, show that the proposed Wi-ViTAL achieves 97.57% average accuracy for five-label classification and more than 76% for eight-label classification in unseen domains. Wi-ViTAL also demonstrates an overall DG improvement compared with other recent benchmarks.
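
The abstract names linear attention as the core of the feature extractor but does not give its formulation. As a rough illustration only, the sketch below shows the generic kernelized linear attention idea with a ReLU feature map, which is an assumption and not necessarily the multi-scale variant used in Wi-ViTAL. The function names `quadratic_attention` and `linear_attention` are hypothetical. The point it shows is that reordering the matrix products avoids building the N x N attention matrix while producing the same output.

```python
import numpy as np


def quadratic_attention(q, k, v, eps=1e-6):
    """Reference form: builds the full (N, N) similarity matrix."""
    # ReLU feature map (assumed here; other non-negative maps also work).
    phi_q, phi_k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    scores = phi_q @ phi_k.T                                  # (N, N)
    weights = scores / (scores.sum(axis=1, keepdims=True) + eps)
    return weights @ v                                        # (N, d_v)


def linear_attention(q, k, v, eps=1e-6):
    """Same result, but cost grows linearly with the token count N."""
    phi_q, phi_k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    kv = phi_k.T @ v               # (d, d_v), summarises all keys and values
    k_sum = phi_k.sum(axis=0)      # (d,), normaliser shared by every query
    numer = phi_q @ kv             # (N, d_v)
    denom = phi_q @ k_sum + eps    # (N,)
    return numer / denom[:, None]


# Toy input: 16 tokens with 8 features each (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 16, 8))
same = np.allclose(linear_attention(q, k, v), quadratic_attention(q, k, v))
```

Both functions compute the same normalised weighted sum; only the order of multiplication differs, which is why `same` evaluates to true up to floating-point tolerance.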

    Original language: English
    Journal: IEEE Transactions on Mobile Computing
    Publication status: Published - Nov 2025

    Free Keywords

    • Adversarial learning
    • domain generalization
    • multi-scale linear attention
    • vision transformer
    • wireless human activity recognition

    ASJC Scopus subject areas

    • Software
    • Computer Networks and Communications
    • Electrical and Electronic Engineering
