Mask-Aware Hierarchical Aggregation Transformer for Occluded Person Re-identification

Guoqing Zhang, Yan Yang, Yuhui Zheng, Gaven Martin, Ruili Wang

Research output: Journal Publication > Article > peer-review

Abstract

Occluded person re-identification (Re-ID) is a challenging problem because incomplete body-part images and interference from occluded regions leave few notable discriminative features. Recently, some transformer-based methods have demonstrated excellent capabilities in resolving this problem; however, these methods cannot precisely focus on the non-occluded body parts or capture fine-grained local features. To address these limitations, we propose a Mask-Aware Hierarchical Aggregation TrAnsforMer (MAHATMA) method to enhance occluded person Re-ID. Specifically, we propose a Mask Information Embedding (MIE) module, which directs the model to focus on non-occluded body parts by incorporating the mask semantic information of a human body. Furthermore, to effectively capture fine-grained local features, we propose a Hierarchical Feature Aggregation (HFA) module that mines more exploitable high-quality detail information by aggregating hierarchical image patch representations. To further alleviate the feature loss problem, we propose a Diverse Feature Completion (DFC) module, which completes global features through multi-path feature integration. Extensive experimental evaluations demonstrate that our method achieves superior performance on both occluded and holistic person Re-ID datasets.
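The two ideas named in the abstract, aggregating patch representations across layers and steering attention toward non-occluded body parts with a mask, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how MIE or HFA are built, so the function names `aggregate_layers` and `mask_weighted_pool`, the plain averaging across layers, and the per-patch visibility weights are all assumptions chosen only to make the intuition concrete.

```python
from typing import List

Vector = List[float]


def aggregate_layers(layer_patches: List[List[Vector]]) -> List[Vector]:
    """Average each patch's feature vector across layers.

    Hypothetical stand-in for hierarchical aggregation: the paper's HFA
    module is not described in the abstract, so plain averaging is assumed.
    """
    num_layers = len(layer_patches)
    num_patches = len(layer_patches[0])
    dim = len(layer_patches[0][0])
    return [
        [sum(layer[p][d] for layer in layer_patches) / num_layers for d in range(dim)]
        for p in range(num_patches)
    ]


def mask_weighted_pool(patches: List[Vector], visibility: List[float]) -> Vector:
    """Pool patch features into one vector, weighting by body visibility.

    `visibility[p]` is an assumed score in [0, 1]: the fraction of patch p
    covered by the human-body mask. Occluded patches (score near 0)
    contribute little to the pooled feature.
    """
    total = sum(visibility)
    if total == 0:
        raise ValueError("no visible patches to pool")
    dim = len(patches[0])
    return [
        sum(w * patch[d] for w, patch in zip(visibility, patches)) / total
        for d in range(dim)
    ]


# Two layers, two patches, two feature dimensions (made-up numbers).
layers = [
    [[1.0, 0.0], [0.0, 4.0]],
    [[3.0, 2.0], [2.0, 0.0]],
]
fused = aggregate_layers(layers)
# Second patch is treated as fully occluded, so only the first contributes.
pooled = mask_weighted_pool(fused, [1.0, 0.0])
```

In this toy setting, setting a patch's visibility to zero removes it from the pooled descriptor entirely, which is the simplest form of the "focus on non-occluded body parts" behaviour the abstract describes; a learned model would presumably use softer, trainable weighting.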

Original language: English
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication status: Accepted/In press - 2025

Keywords

  • feature aggregation
  • feature completion
  • occluded person re-identification
  • vision transformer

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering
