
Weakly Supervised Salient Object Detection with Text Supervision

  • Zhihao Wu*
  • Jie Wen
  • Linlin Shen
  • Xiaopeng Fan
  • Yong Xu*
  • Jian Yang
  • David Zhang

* Corresponding author for this work

Research output: Journal Publication › Article › peer-review

Abstract

Weakly supervised salient object detection using image-category supervision offers a cost-effective alternative to dense annotations, yet suffers from significant performance degradation. This is primarily attributed to the limitations of existing pseudo-label generation methods, which tend to either under- or over-activate object regions and indiscriminately label all non-activated pixels as background, introducing considerable label noise. Furthermore, these methods are limited in their ability to capture objects beyond the pre-trained category set. To overcome these challenges, we propose a CLIP-based pseudo-label generation method that exploits text prompts to jointly activate generic background and salient objects, breaking the dependency on specific categories. However, we find that this paradigm faces three challenges: optimal prompt uncertainty, background redundancy, and object-background conflict. To mitigate these, we propose three key modules. First, spatial distribution-guided prompt selection evaluates the spatial distribution of activation regions to identify the optimal prompt. Second, center and scale prior-guided activation refinement integrates self-attention and superpixel cues to suppress background noise. Third, learning feedback-guided pseudo-label update learns saliency knowledge from other pseudo-labels to resolve conflicting regions and iteratively refine supervision. Extensive experiments demonstrate that our method surpasses both previous weakly supervised methods that use image-category supervision and unsupervised approaches.
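The abstract does not say how the spatial distribution of an activation map is evaluated, so the following is only a rough sketch of the general idea behind prompt selection, not the authors' method. It assumes each candidate prompt has already produced a 2D activation map (for example from CLIP), and it ranks prompts with an invented score: the distance of the activation centroid from the image center plus the spread of the activation around that centroid. The function names, the scoring rule, the prompt strings, and the toy maps are all hypothetical.

```python
from typing import Dict, List, Tuple

Map = List[List[float]]  # activation map as rows of non-negative values


def spatial_stats(act: Map) -> Tuple[float, float]:
    """Return (center_offset, spread) in normalised image coordinates."""
    h, w = len(act), len(act[0])
    total = sum(sum(row) for row in act)
    if total <= 0:
        return 1.0, 1.0  # empty map: treat as the worst case

    # Pixel centers mapped to (0, 1) along each axis.
    ys = [(y + 0.5) / h for y in range(h)]
    xs = [(x + 0.5) / w for x in range(w)]

    # Activation-weighted centroid.
    cy = sum(v * ys[y] for y, row in enumerate(act) for v in row) / total
    cx = sum(v * xs[x] for row in act for x, v in enumerate(row)) / total

    # Activation-weighted variance around the centroid.
    var = sum(
        v * ((ys[y] - cy) ** 2 + (xs[x] - cx) ** 2)
        for y, row in enumerate(act)
        for x, v in enumerate(row)
    ) / total

    offset = ((cy - 0.5) ** 2 + (cx - 0.5) ** 2) ** 0.5
    return offset, var ** 0.5


def select_prompt(maps: Dict[str, Map]) -> str:
    """Pick the prompt whose map is most compact and most central (assumed rule)."""
    return min(maps, key=lambda p: sum(spatial_stats(maps[p])))


# Toy 4x4 maps for three hypothetical prompts.
maps = {
    "a photo of a salient object": [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ],
    "a photo of something": [[1, 1, 1, 1] for _ in range(4)],
    "a photo of an item": [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ],
}
best = select_prompt(maps)
```

In this toy setup the centered, compact map scores lowest and its prompt is selected; a real system would need a scoring rule validated against the actual activation statistics.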

Original language: English
Article number: 74
Journal: International Journal of Computer Vision
Volume: 134
Issue number: 2
DOIs
Publication status: Published - Feb 2026
Externally published: Yes

Free Keywords

  • Language-vision large model
  • Salient object detection
  • Unsupervised learning
  • Weakly supervised learning

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
