Metamorphic Exploration for Machine Learning Validation and Model Selection

Research output: Contribution to conference; Abstract; peer-review


Over recent years, machine learning algorithms have been widely adopted in financial services for credit risk modelling, driven by improvements in machine learning technology and data availability, and by the fact that machine learning typically shows better predictive performance than traditional linear models. However, the adoption of machine learning raises serious validation issues related to the models’ inherent complexity, requiring tests that were either unnecessary or easier to perform for the simpler linear models. In particular, it is required that (1) decision-making using machine learning models is explainable, (2) machine learning algorithms are shown to be robust over time and across different data segments, (3) models are fair and unbiased, and (4) models match business intuition. There has been a great deal of research on the first three problems, but this study focuses on the last, which we believe is under-researched.
This study provides a new perspective for validating credit scoring models: one that uses properties of the model, or properties hypothesized by users based on business intuition, to allow testers to predict how a particular change in the input should affect the output. Specifically, as a complement to performance measures of model fit and calibration, our new approach checks whether the output changes in the way we expect, based on prior application knowledge. For example, if credit bureau score is included as a predictor, we would expect a model to predict lower credit risk as the credit score increases. This new approach is based on two software testing techniques, Metamorphic Testing and Metamorphic Exploration. These techniques use Metamorphic Relations, which express required or expected properties of target systems, to examine the inputs and outputs of multiple test cases.
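The credit bureau score example can be sketched as a Metamorphic Relation check. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function `violation_rate`, the feature name `bureau_score`, the step size, and both toy scoring functions are hypothetical and stand in for a fitted model. For each source applicant, a follow-up test case is built by raising the score, and the relation is counted as violated when the predicted risk goes up.

```python
import math
from typing import Callable, Dict, List

Applicant = Dict[str, float]
Model = Callable[[Applicant], float]  # returns a predicted probability of default


def violation_rate(model: Model, applicants: List[Applicant],
                   feature: str, delta: float, tol: float = 1e-12) -> float:
    """Fraction of applicants for whom raising `feature` by `delta`
    increases predicted risk, i.e. violates the expected relation."""
    violations = 0
    for source in applicants:
        follow_up = dict(source)                  # copy the source test case
        follow_up[feature] = source[feature] + delta
        if model(follow_up) > model(source) + tol:
            violations += 1
    return violations / len(applicants)


def monotone_model(a: Applicant) -> float:
    # Hypothetical model: risk falls smoothly as the bureau score rises.
    return 1.0 / (1.0 + math.exp(0.01 * (a["bureau_score"] - 600.0)))


def wiggly_model(a: Applicant) -> float:
    # Hypothetical model with a counter-intuitive jump in risk above 700.
    bump = 0.2 if a["bureau_score"] > 700.0 else 0.0
    return min(1.0, monotone_model(a) + bump)


# Hypothetical applicants with bureau scores 500, 520, ..., 780.
applicants = [{"bureau_score": float(s), "income": 30000.0}
              for s in range(500, 800, 20)]

rate_monotone = violation_rate(monotone_model, applicants, "bureau_score", 20.0)
rate_wiggly = violation_rate(wiggly_model, applicants, "bureau_score", 20.0)
print(rate_monotone, rate_wiggly)
```

A violation rate of this kind could be reported next to a model fit metric when comparing candidate models. The choice of step size and tolerance is an assumption of this sketch and would need to reflect the business intuition being tested.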
We conduct a case study investigating the use of traditional evaluation metrics along with Metamorphic Exploration for credit scoring model validation. In particular, we investigate how models selected on the basis of a traditional model fit evaluation metric fare when evaluated using Metamorphic Exploration. Our empirical results show that these models exhibit violations of Metamorphic Relations, and that these violations become more extensive as the complexity of the model increases. Therefore, we propose Metamorphic Testing and Exploration as a validation and model selection step, complementary to traditional model fit measures, when machine learning is used for credit risk modelling.
Translated title of the contribution: 机器学习验证和模型选择的蜕变测试探索
Original language: English
Number of pages: 1
Publication status: Published - 1 Sept 2023
Event: Credit Scoring and Credit Control Conference - University of Edinburgh Business School, Edinburgh, United Kingdom
Duration: 30 Aug 2023 to 1 Sept 2023
Conference number: 18

