Token pyramid pooling-driven style adapter learning with dual-view balanced loss for imbalanced diabetic retinopathy grading

Jilu Zhao, Xiaoqing Zhang, Jiawei Zhang, Hanxi Sun, Qiushi Nie, Zunjie Xiao, Linxia Xiao, Fengyun Zhang, Yan Hu, Jiang Liu

Research output: Journal Publication › Article › peer-review

Abstract

Precise diabetic retinopathy (DR) grading is essential for developing personalized and effective treatment plans. Although deep neural networks (DNNs) have achieved promising DR grading results, constructing a precise and trustworthy DR grading model remains challenging due to limited high-quality medical image data, high computational costs, and imbalanced data distributions. To tackle these challenges, we explore the transferability of feature representations from pre-trained vision foundation models (VFMs) to fundus images through adapter learning, aiming to build an efficient imbalanced DR grading model. Unlike classical full-tuning, which fine-tunes all pre-trained parameters of VFMs, adapter learning achieves competitive performance while fine-tuning only a negligible number of additional parameters. Motivated by this, we develop Token Pyramid Pooling-Driven Style Adapter Learning (TPDSAL) to better capture task-specific feature representations from VFMs; it exploits the pathological distribution prior of DR and the inherent characteristics of fundus imaging. In addition, we propose a novel dual-view balanced (DVB) loss to improve imbalanced DR grading performance and trustworthiness; it uses the training class frequencies in the sample-wise predicted logit space and the sample-wise loss value space simultaneously. Extensive experiments on four public fundus image datasets demonstrate the superiority of TPDSAL with DVB over competitive transfer tuning and loss methods in terms of imbalanced grading performance and trustworthiness. Further analysis suggests that using clinical prior knowledge helps adapter learning capture task-specific feature representations from VFMs.
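
The abstract does not give the DVB formulation, so the following is only a rough sketch of the general idea of using training class frequencies in both the logit space and the loss value space. It assumes two well-known stand-ins that may differ from the paper's actual design: a log-prior shift of the logits and inverse-frequency weighting of the per-sample loss. The function name, the `tau` parameter, and the weight normalization are all hypothetical.

```python
import numpy as np

def dual_view_balanced_loss_sketch(logits, labels, class_counts, tau=1.0):
    """Illustrative stand-in only; NOT the paper's DVB loss.

    logits: (N, C) raw model outputs; labels: (N,) integer grades;
    class_counts: (C,) number of training samples per class.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    counts = np.asarray(class_counts, dtype=float)

    # View 1 (assumed): logit space. Shift each logit by the log class prior.
    prior = counts / counts.sum()
    adjusted = logits + tau * np.log(prior)

    # Numerically stable log-softmax followed by per-sample cross-entropy.
    shifted = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]

    # View 2 (assumed): loss space. Inverse-frequency weights, scaled so
    # that they average to 1 across classes.
    weights = 1.0 / counts
    weights = weights / weights.sum() * len(counts)

    return float((weights[labels] * per_sample).mean())
```

With balanced class counts both adjustments are neutral and the sketch reduces to plain cross-entropy; with imbalanced counts, samples from rare grades contribute more to the loss than samples from frequent grades given the same logits.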

Original language: English
Article number: 112194
Journal: Pattern Recognition
Volume: 171
Publication status: Published - Mar 2026

Keywords

  • Adapter learning
  • Dual-view balanced loss
  • Imbalanced DR grading
  • Pre-trained vision foundation models
  • Transferability
  • Trustworthiness

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
