Knowledge-sharing hierarchical memory fusion network for scribble-supervised video salient object detection

Tao Jiang, Feng Hou, Yi Wang, Guangzhu Chen, Ruili Wang

Research output: Journal Publication › Article › peer-review

Abstract

Scribble annotations offer a practical alternative to pixel-wise labels in video salient object detection (V-SOD). However, their sparse foreground coverage and ambiguous boundaries introduce background interference and error propagation, degrading segmentation accuracy across frames. To address this issue, we propose a novel Knowledge-sharing Hierarchical Memory Fusion Network (KHMF-Net) for scribble-supervised V-SOD. The core of our framework is a Hierarchical Memory Bank (HMB) that stores initial scribbles, historical high-confidence regions, and historical full salient maps, enabling long-term spatiotemporal context modeling to suppress error propagation. Additionally, we introduce an Adaptive Memory Fusion (AMF) module to dynamically integrate multi-confidence features, providing reliable guidance during salient mask expansion. To address background interference, we design an Interactive Equalized Matching (IEM) module with reference-wise softmax, ensuring balanced contributions from reference frame pixels. A dual-attention knowledge-sharing mechanism is further proposed to enhance IEM by transferring high-performance attention features from a Teacher to a Student module, improving segmentation accuracy. Experimental results demonstrate that KHMF-Net's hierarchical memory architecture and effective background-target discrimination enable state-of-the-art performance on three scribble-annotated datasets, even exceeding some fully supervised approaches. The module and predicted maps are publicly available at https://github.com/TOMMYWHY/KHMF-Net.
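The abstract names "reference-wise softmax" but gives no formulation, so the following is only a minimal sketch of one plausible reading, not the authors' IEM module. It assumes matching scores are dot products between reference-frame and query-frame pixel features, and that the softmax is taken per reference pixel across all query pixels, so every reference pixel distributes the same total weight. The function names, the dot-product similarity, and the read-out step are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))

def reference_wise_weights(ref_keys, query_keys):
    """Return weights[r][q], normalized over query pixels q for each reference pixel r.

    Assumption: each row sums to 1, so no single reference pixel
    (for example a background pixel with a large feature norm)
    can contribute more total weight than any other.
    """
    return [softmax([dot(r, q) for q in query_keys]) for r in ref_keys]

def read_out(weights, ref_values):
    """Aggregate reference values at each query pixel using the matching weights."""
    n_ref = len(ref_values)
    n_query = len(weights[0])
    dim = len(ref_values[0])
    return [
        [sum(weights[r][q] * ref_values[r][d] for r in range(n_ref)) for d in range(dim)]
        for q in range(n_query)
    ]
```

In this sketch a reference pixel whose features score highly against every query pixel still distributes a total weight of 1. Under a query-wise softmax the same pixel could dominate the read-out at every query location.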

Original language: English
Pages (from-to): 177-183
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 196
Publication status: Published - Oct 2025

Keywords

  • Knowledge-sharing
  • Scribble-supervised
  • Video salient object detection
  • Weakly supervised

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
