Resampling Techniques Study on Class Imbalance Problem in Credit Risk Prediction

Zixue Zhao, Tianxiang Cui, Shusheng Ding, Jiawei Li, Anthony Graham Bellotti

Research output: Journal PublicationArticlepeer-review

1 Citation (Scopus)


Credit risk prediction heavily relies on historical data provided by financial institutions. The goal is to identify commonalities among defaulting users based on existing information. However, data on defaulters is often limited, leading to a concentration of credit data where positive samples (defaults) are significantly fewer than negative samples (nondefaults). It poses a serious challenge known as the class imbalance problem, which can substantially impact data quality and predictive model effectiveness. To address the problem, various resampling techniques have been proposed and studied extensively. However, despite ongoing research, there is no consensus on the most effective technique. The choice of resampling technique is closely related to the dataset size and imbalance ratio, and its effectiveness varies across different classifiers. Moreover, there is a notable gap in research concerning suitable techniques for extremely imbalanced datasets. Therefore, this study aims to compare popular resampling techniques across different datasets and classifiers while also proposing a novel hybrid sampling method tailored for extremely imbalanced datasets. Our experimental results demonstrate that this new technique significantly enhances classifier predictive performance, shedding light on effective strategies for managing the class imbalance problem in credit risk prediction.

Original languageEnglish
Article number701
Issue number5
Publication statusPublished - Mar 2024


  • class imbalance
  • credit risk prediction
  • resampling

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • General Mathematics
  • Engineering (miscellaneous)


Dive into the research topics of 'Resampling Techniques Study on Class Imbalance Problem in Credit Risk Prediction'. Together they form a unique fingerprint.

Cite this