Stacking-based heterogeneous genetic programming for interpretable credit risk evaluation

Research output: Journal Publication, Article, peer-review

Abstract

The development of advanced ensemble models to handle complex and large-scale datasets has become a central focus in credit risk prediction. Although ensemble methods offer strong predictive performance, stacking lacks a standardized construction pipeline, and its complex structure often reduces transparency and robustness. To address these challenges, this study proposes a stacking-based heterogeneous genetic programming classifier (SH-GPC), an end-to-end pipeline in which genetic programming (GP) serves as the meta-classifier to enhance interpretability and generalization. In experiments on a large-scale credit risk dataset comprising millions of records, SH-GPC significantly outperforms conventional homogeneous ensemble methods and several emerging models, including XGBoost, LightGBM, NGBoost, and TabNet, in terms of AUC. Compared to stacking frameworks with logistic regression or XGBoost as the meta-classifier, SH-GPC achieves better predictive accuracy while relying on fewer base classifiers, thereby improving simplicity and interpretability. Transparency is further enhanced by representing the GP meta-classifier as evolved symbolic expressions and syntax trees. Additionally, the incorporation of the Shapley additive explanations (SHAP) technique enables visualization and attribution of each base classifier's contribution, offering insights into the model's internal decision logic. This study demonstrates the applicability of evolutionary algorithms in ensemble learning and introduces a new framework for credit risk modeling that balances accuracy, stability, and interpretability.
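
The core idea of the abstract, a meta-classifier expressed as an evolved symbolic formula over base-classifier outputs, can be illustrated with a small toy sketch. This is not the authors' SH-GPC implementation: the synthetic base-classifier scores, the operator set, the mutation-only search, and all function names (`random_tree`, `evolve`, and so on) are assumptions made for illustration. For brevity the sketch also scores candidates on the same data it evolves on, whereas a stacking pipeline would normally fit the meta-learner on out-of-fold predictions.

```python
import random

# Operators available to the evolved meta-classifier (an illustrative choice).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "max": max,
    "min": min,
}

def random_tree(n_inputs, depth, rng):
    """Grow a random expression tree; integer leaves index base classifiers."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_inputs)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(n_inputs, depth - 1, rng),
            random_tree(n_inputs, depth - 1, rng))

def evaluate(tree, row):
    """Evaluate a tree on one row of base-classifier scores."""
    if isinstance(tree, int):
        return row[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))

def to_str(tree):
    """Render a tree as a readable symbolic expression."""
    if isinstance(tree, int):
        return f"p{tree}"
    op, left, right = tree
    return f"{op}({to_str(left)}, {to_str(right)})"

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fitness(tree, X, y):
    return auc([evaluate(tree, row) for row in X], y)

def mutate(tree, n_inputs, rng):
    """Replace a randomly chosen subtree with a freshly grown one."""
    if isinstance(tree, int) or rng.random() < 0.3:
        return random_tree(n_inputs, 2, rng)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, n_inputs, rng), right)
    return (op, left, mutate(right, n_inputs, rng))

def evolve(X, y, n_inputs, pop_size=40, generations=15, seed=0):
    rng = random.Random(seed)
    # Seed with each base classifier alone; with elitism, the result is then
    # never worse (on this data) than the best single base classifier.
    pop = list(range(n_inputs))
    pop += [random_tree(n_inputs, 3, rng) for _ in range(pop_size - n_inputs)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, X, y), reverse=True)
        elite = pop[: pop_size // 4]
        pop = elite + [mutate(rng.choice(elite), n_inputs, rng)
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=lambda t: fitness(t, X, y))

# Synthetic stand-ins for three base classifiers of decreasing quality.
data_rng = random.Random(42)
y = [data_rng.randrange(2) for _ in range(200)]
X = [[0.5 + (label - 0.5) * w + data_rng.gauss(0, 0.25) for w in (0.4, 0.3, 0.2)]
     for label in y]

base_aucs = [auc([row[i] for row in X], y) for i in range(3)]
best = evolve(X, y, n_inputs=3)
best_auc = fitness(best, X, y)
print("base AUCs:", [round(a, 3) for a in base_aucs])
print("evolved meta-classifier:", to_str(best), "AUC:", round(best_auc, 3))
```

The printed expression is the whole meta-classifier, which is what makes this kind of model inspectable: any base classifier whose index does not appear in the formula contributes nothing to the final score.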

Original language: English
Article number: 114214
Journal: Applied Soft Computing
Volume: 186
Publication status: Published - Jan 2026

Free Keywords

  • Credit risk prediction
  • Genetic programming
  • Heterogeneous stacking
  • Natural gradient boosting
  • Shapley additive explanations

ASJC Scopus subject areas

  • Software
