Spatial Context-Aware Object-Attentional Network for Multi-Label Image Classification

Jialu Zhang; Jianfeng Ren; Qian Zhang; Jiang Liu; Xudong Jiang

doi:10.1109/TIP.2023.3266161

School of Computer Science

Research output: Journal Publication › Article › peer-review

32 Citations (Scopus)

Abstract

Multi-label image classification is a fundamental but challenging task in computer vision. To tackle the problem, the label-related semantic information is often exploited, but the background context and spatial semantic information of related objects are not fully utilized. To address these issues, a multi-branch deep neural network is proposed in this paper. The first branch is designed to extract the discriminant information from regions of interest to detect target objects. In the second branch, a spatial context-aware approach is proposed to better capture the contextual information of an object in its surroundings by using an adaptive patch expansion mechanism. It helps the detection of small objects that are easily lost without the support of context information. The third one, the object-attentional branch, exploits the spatial semantic relations between the target object and its related objects, to better detect partially occluded, small or dim objects with the support of those easily detectable objects. To better encode such relations, an attention mechanism jointly considering the spatial and semantic relations between objects is developed. Two widely used benchmark datasets for multi-labeling classification, MS COCO and PASCAL VOC, are used to evaluate the proposed framework. The experimental results demonstrate that the proposed method outperforms the state-of-the-art methods for multi-label image classification.
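The abstract names an adaptive patch expansion mechanism but does not specify its rule. As a rough illustration only, the sketch below shows one plausible reading: a region of interest is enlarged around its centre so that smaller boxes receive proportionally more surrounding context, and the result is clipped to the image. The function name `expand_patch`, the parameters `small_area` and `max_scale`, and the scaling formula are all assumptions made for this example; they are not taken from the paper.

```python
def expand_patch(box, image_size, small_area=32 * 32, max_scale=2.0):
    """Enlarge a box (x1, y1, x2, y2) around its centre and clip it to the image.

    Hypothetical rule (not from the paper): boxes at or below `small_area`
    pixels are scaled by `max_scale`; larger boxes are scaled less, with the
    factor approaching 1.0 as the box area grows.
    """
    x1, y1, x2, y2 = box
    img_w, img_h = image_size
    w, h = x2 - x1, y2 - y1
    area = max(w * h, 1e-6)  # guard against degenerate boxes

    # Expansion factor in [1.0, max_scale], larger for smaller boxes.
    scale = 1.0 + (max_scale - 1.0) * min(1.0, small_area / area)

    # Expand symmetrically around the box centre.
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    new_w, new_h = w * scale, h * scale

    # Clip the expanded patch to the image bounds.
    return (
        max(0.0, cx - new_w / 2.0),
        max(0.0, cy - new_h / 2.0),
        min(float(img_w), cx + new_w / 2.0),
        min(float(img_h), cy + new_h / 2.0),
    )


# A 16x16 box is doubled in size; a box covering the whole image is unchanged.
small = expand_patch((100, 100, 116, 116), (640, 480))
full = expand_patch((0, 0, 640, 480), (640, 480))
```

Under these assumptions a small object gains a wide context window while a large object keeps roughly its own extent, which matches the abstract's stated motivation that small objects are easily lost without context.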

Original language	English
Pages (from-to)	3000-3012
Number of pages	13
Journal	IEEE Transactions on Image Processing
Volume	32
DOIs	https://doi.org/10.1109/TIP.2023.3266161
Publication status	Published - 2023

Keywords

Multi-label image classification
adaptive patch expansion
object clustering
spatial context-aware object detection
spatial semantic attention

ASJC Scopus subject areas

Software
Computer Graphics and Computer-Aided Design

Access to Document

10.1109/TIP.2023.3266161

Cite this

@article{f0027deeb0204893828ad66001a5074b,

title = "Spatial Context-Aware Object-Attentional Network for Multi-Label Image Classification",

abstract = "Multi-label image classification is a fundamental but challenging task in computer vision. To tackle the problem, the label-related semantic information is often exploited, but the background context and spatial semantic information of related objects are not fully utilized. To address these issues, a multi-branch deep neural network is proposed in this paper. The first branch is designed to extract the discriminant information from regions of interest to detect target objects. In the second branch, a spatial context-aware approach is proposed to better capture the contextual information of an object in its surroundings by using an adaptive patch expansion mechanism. It helps the detection of small objects that are easily lost without the support of context information. The third one, the object-attentional branch, exploits the spatial semantic relations between the target object and its related objects, to better detect partially occluded, small or dim objects with the support of those easily detectable objects. To better encode such relations, an attention mechanism jointly considering the spatial and semantic relations between objects is developed. Two widely used benchmark datasets for multi-labeling classification, MS COCO and PASCAL VOC, are used to evaluate the proposed framework. The experimental results demonstrate that the proposed method outperforms the state-of-the-art methods for multi-label image classification.",

keywords = "Multi-label image classification, adaptive patch expansion, object clustering, spatial context-aware object detection, spatial semantic attention",

author = "Jialu Zhang and Jianfeng Ren and Qian Zhang and Jiang Liu and Xudong Jiang",

note = "Publisher Copyright: {\textcopyright} 1992-2012 IEEE.",

year = "2023",

doi = "10.1109/TIP.2023.3266161",

language = "English",

volume = "32",

pages = "3000--3012",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Spatial Context-Aware Object-Attentional Network for Multi-Label Image Classification

AU - Zhang, Jialu

AU - Ren, Jianfeng

AU - Zhang, Qian

AU - Liu, Jiang

AU - Jiang, Xudong

PY - 2023

Y1 - 2023

N2 - Multi-label image classification is a fundamental but challenging task in computer vision. To tackle the problem, the label-related semantic information is often exploited, but the background context and spatial semantic information of related objects are not fully utilized. To address these issues, a multi-branch deep neural network is proposed in this paper. The first branch is designed to extract the discriminant information from regions of interest to detect target objects. In the second branch, a spatial context-aware approach is proposed to better capture the contextual information of an object in its surroundings by using an adaptive patch expansion mechanism. It helps the detection of small objects that are easily lost without the support of context information. The third one, the object-attentional branch, exploits the spatial semantic relations between the target object and its related objects, to better detect partially occluded, small or dim objects with the support of those easily detectable objects. To better encode such relations, an attention mechanism jointly considering the spatial and semantic relations between objects is developed. Two widely used benchmark datasets for multi-labeling classification, MS COCO and PASCAL VOC, are used to evaluate the proposed framework. The experimental results demonstrate that the proposed method outperforms the state-of-the-art methods for multi-label image classification.

AB - Multi-label image classification is a fundamental but challenging task in computer vision. To tackle the problem, the label-related semantic information is often exploited, but the background context and spatial semantic information of related objects are not fully utilized. To address these issues, a multi-branch deep neural network is proposed in this paper. The first branch is designed to extract the discriminant information from regions of interest to detect target objects. In the second branch, a spatial context-aware approach is proposed to better capture the contextual information of an object in its surroundings by using an adaptive patch expansion mechanism. It helps the detection of small objects that are easily lost without the support of context information. The third one, the object-attentional branch, exploits the spatial semantic relations between the target object and its related objects, to better detect partially occluded, small or dim objects with the support of those easily detectable objects. To better encode such relations, an attention mechanism jointly considering the spatial and semantic relations between objects is developed. Two widely used benchmark datasets for multi-labeling classification, MS COCO and PASCAL VOC, are used to evaluate the proposed framework. The experimental results demonstrate that the proposed method outperforms the state-of-the-art methods for multi-label image classification.

KW - Multi-label image classification

KW - adaptive patch expansion

KW - object clustering

KW - spatial context-aware object detection

KW - spatial semantic attention

UR - http://www.scopus.com/inward/record.url?scp=85159840794&partnerID=8YFLogxK

U2 - 10.1109/TIP.2023.3266161

DO - 10.1109/TIP.2023.3266161

M3 - Article

C2 - 37163392

AN - SCOPUS:85159840794

SN - 1057-7149

VL - 32

SP - 3000

EP - 3012

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

ER -
