IENet: inheritance enhancement network for video salient object detection

Tao Jiang, Yi Wang, Feng Hou, Ruili Wang

Research output: Journal Publication › Article › peer-review

Abstract

Effective use of spatiotemporal information is essential for improving the accuracy and robustness of Video Salient Object Detection (V-SOD). However, current methods do not fully exploit information from historical frames, which results in insufficient integration of complementary semantic information. To address this issue, we propose a novel Transformer-based Inheritance Enhancement Network (IENet). The core of IENet is a Heritable Multi-Frame Attention (HMA) module, which exploits long-term context and frame-aware temporal modeling during feature extraction through unidirectional cross-frame enhancement. In contrast to existing methods, our heritable strategy follows a unidirectional inheritance model that passes attention maps from frame to frame, which keeps the information propagated to each frame consistent and orderly and avoids additional interference. Furthermore, we propose an auxiliary attention loss that uses the inherited attention maps to direct the network to focus more on target regions. Experimental results on five popular benchmark datasets demonstrate the effectiveness of IENet in handling challenging scenes. For instance, on VOS and DAVSOD, our method achieves MAE values of 0.042% and 0.070%, respectively, in comparison with other competitive models. In particular, IENet excels at inheriting finer details from historical frames, even in complex environments. The module and predicted maps are publicly available at https://github.com/TOMMYWHY/IENet
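
The abstract reports results in terms of MAE (mean absolute error), a standard saliency metric: the average absolute per-pixel difference between a predicted saliency map and its ground-truth mask, with both scaled to [0, 1]. The sketch below is a minimal illustration of that definition, not the authors' evaluation code. The function names `mean_absolute_error` and `video_mae` are hypothetical, and averaging per-frame scores over a clip is an assumption of this sketch; the exact protocol used in the paper is not stated here.

```python
from typing import List, Sequence

Map2D = Sequence[Sequence[float]]  # one saliency map as rows of values in [0, 1]


def mean_absolute_error(pred: Map2D, gt: Map2D) -> float:
    """Average absolute per-pixel difference between two same-sized maps."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    count = sum(len(row) for row in pred)
    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    total = sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    return total / count


def video_mae(preds: Sequence[Map2D], gts: Sequence[Map2D]) -> float:
    """Assumed protocol for this sketch: mean of the per-frame MAE scores."""
    if len(preds) != len(gts) or len(preds) == 0:
        raise ValueError("need the same non-zero number of frames in both inputs")
    scores: List[float] = [mean_absolute_error(p, g) for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)


# Toy 2x2 frames: the first differs from its mask in two pixels by 0.5 each,
# the second matches its mask exactly.
frame_a = [[0.0, 1.0], [0.5, 0.5]]
mask_a = [[0.0, 1.0], [1.0, 0.0]]
frame_b = [[1.0, 0.0], [0.0, 1.0]]
mask_b = [[1.0, 0.0], [0.0, 1.0]]

print(mean_absolute_error(frame_a, mask_a))            # 0.25
print(video_mae([frame_a, frame_b], [mask_a, mask_b]))  # 0.125
```

Lower values are better: a score of 0 means the predicted map matches the mask at every pixel.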

Original language: English
Pages (from-to): 72007-72026
Number of pages: 20
Journal: Multimedia Tools and Applications
Volume: 83
Issue number: 28
Publication status: Published - Aug 2024
Externally published: Yes

Keywords

  • Feature fusion
  • Frame-aware temporal relationships
  • Video salient object detection
  • Visual transformer

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
