Abstract
Most existing models for abstract visual reasoning perform poorly on compositional visual reasoning (CVR), owing to the complex nature of compositional rules and the difficulty of distinguishing subtle rule differences between outliers and normal images. To tackle these challenges, we propose a Dual-Branch Compositional Reasoning (DBCR) model that exploits both intra-cluster relations among the cluster of normal images and extra-cluster relations between normal images and outliers. Specifically, we design one branch of Intra-Cluster Regression Reasoning Blocks (ICR2Bs) to encapsulate common relations among normal images through hierarchical regression reasoning, and another branch of Contrastive Attention Reasoning Blocks (CARBs) to exploit extra-cluster differences between normal images and outliers through self-attention. Simultaneously minimizing the regression errors in ICR2Bs and maximizing the extra-cluster differences in CARBs helps identify the correct cluster of normal images. Experimental results on two CVR datasets show that the proposed DBCR consistently outperforms state-of-the-art models. The code is available at https://github.com/He1mont/DBCR.
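
The selection principle stated in the abstract (low regression error inside the normal cluster, large difference between the cluster and the outlier) can be illustrated with a toy sketch. This is not the DBCR implementation: the learned ICR2B and CARB branches are replaced here by hand-written stand-ins (a leave-one-out mean predictor and a Euclidean distance), and the function names, the 2-D feature vectors, and the simple score `difference - error` are all assumptions made for illustration only.

```python
from math import dist


def mean_vec(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def intra_cluster_error(cluster):
    """Hypothetical stand-in for the regression branch: predict each
    member from the mean of the other members and average the squared error."""
    total = 0.0
    for i, target in enumerate(cluster):
        others = cluster[:i] + cluster[i + 1:]
        total += dist(target, mean_vec(others)) ** 2
    return total / len(cluster)


def extra_cluster_difference(candidate, cluster):
    """Hypothetical stand-in for the contrastive branch: distance between
    the candidate outlier and the mean of the remaining images."""
    return dist(candidate, mean_vec(cluster))


def pick_outlier(features):
    """Try each image as the outlier; keep the split whose remaining cluster
    is most self-consistent and most different from the candidate."""
    best_index, best_score = None, float("-inf")
    for i, candidate in enumerate(features):
        cluster = features[:i] + features[i + 1:]
        score = extra_cluster_difference(candidate, cluster) - intra_cluster_error(cluster)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Assumed toy features: three similar vectors and one that breaks the pattern.
features = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [4.0, -2.0]]
outlier_index = pick_outlier(features)
print(outlier_index)
```

In this toy setting the last vector is selected, because removing it leaves the most self-consistent cluster. In the model described above, both quantities would come from trained network branches operating on image features rather than from fixed formulas.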
| Original language | English |
|---|---|
| Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
| DOIs | |
| Publication status | Published - 2025 |
| Event | 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India. Duration: 6 Apr 2025 → 11 Apr 2025 |
Keywords
- Abstract Visual Reasoning
- Compositional Visual Reasoning
- Contrastive Attention Reasoning Block
- Intra-Cluster Regression Reasoning Block
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering