Addressing the performance challenges of metamorphic testing

Zhihao Ying

School of Computer Science

Student thesis: PhD Thesis

Abstract

Software testing is an important process that should be considered throughout the entire life-cycle of software development: it is used to assess and assure the quality of the System Under Test (SUT). One of the fundamental problems faced in software testing is the oracle problem: it can be too expensive, or even impossible, to implement an oracle, that is, a mechanism to verify the correctness of the output or behavior of the SUT. Metamorphic Testing (MT) is a popular property-based software testing approach that has been shown to be effective in alleviating the oracle problem. As a central component of MT, Metamorphic Relations (MRs) are generally derived from necessary properties of the SUT. To implement MT, some program inputs are first generated as Source Test Cases (STCs), and then an MR is used to generate new inputs as Follow-up Test Cases (FTCs) based on the STCs. If the actual outputs of the STCs and FTCs violate the given MR, then the SUT is considered faulty with respect to the property related to the MR. Unlike the traditional way of detecting software failures by checking the test result against an oracle, MT detects failures by verifying the MRs among STCs and FTCs as well as their corresponding outputs. The STCs and their corresponding FTCs, considered as a whole, are called Metamorphic Groups (MGs), and the MGs that violate the given MR are called MR-violating MGs.
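The MT workflow described above can be illustrated with a minimal sketch. It is not taken from the thesis: the SUT (a sine implementation), the MR sin(x) = sin(π − x), the deliberately faulty variant `sin_faulty`, and the helper names are all assumptions chosen for illustration.

```python
import math
import random


def sin_faulty(x):
    # Hypothetical faulty SUT: a truncated Taylor series that is
    # only accurate close to zero.
    return x - x**3 / 6 + x**5 / 120


def follow_up(stc):
    # MR used here (an assumption for illustration): sin(x) == sin(pi - x).
    # The FTC is derived from the STC by this input transformation.
    return math.pi - stc


def find_violating_mgs(sut, source_test_cases, tol=1e-9):
    """Return the MGs (STC, FTC) whose outputs violate the MR."""
    violating = []
    for stc in source_test_cases:
        ftc = follow_up(stc)
        # No oracle is needed: only the relation between the two
        # outputs is checked, never the correctness of either one.
        if not math.isclose(sut(stc), sut(ftc), abs_tol=tol):
            violating.append((stc, ftc))
    return violating


rng = random.Random(0)
stcs = [rng.uniform(-10.0, 10.0) for _ in range(100)]

print("math.sin violations:  ", len(find_violating_mgs(math.sin, stcs)))
print("sin_faulty violations:", len(find_violating_mgs(sin_faulty, stcs)))
```

Each pair returned by `find_violating_mgs` corresponds to an MR-violating MG in the terminology above. The tolerance `tol` is needed because floating-point outputs rarely match exactly.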
Despite its increasing popularity, the performance of MT still needs further improvement. This thesis focuses on addressing the performance challenges of MT. Specifically, this thesis attempts to: (1) improve MT performance (i.e. test effectiveness and efficiency) by enhancing the quality of MRs and MGs; (2) improve MT performance by addressing the problems existing in the design and application of MG-generation algorithms; and (3) improve software testing performance (for testing credit risk models) by employing MT as an additional model testing, validation and selection methodology.
First, the successful implementation of MT as well as its performance are highly dependent on the MRs and MGs, and therefore, this thesis improves the quality of MRs and MGs throughout their entire life-cycles:
• This thesis proposes new MR patterns to guide the identification of concrete MRs.
• This thesis proposes new MG-generation algorithms to generate effective MGs.
• This thesis proposes a new MR-MG pair selection algorithm to automatically and dynamically select effective MR-MG pairs for execution from existing ones.
This thesis evaluates the performance of the proposed methodologies through empirical experiments, and the experimental results indicate that they are capable of improving both the efficiency and effectiveness of MT. In addition, this thesis introduces the concept of MR-violation regions as an additional evaluation method (for validating experimental results of different MG-generation algorithms).
Second, through the evaluation of MG-generation algorithms, this thesis identifies that previous MG-generation algorithms may encounter certain problems, which may negatively affect their performance (i.e. test effectiveness and efficiency). This thesis summarizes those situations and formally proposes the concepts of the MT-performance evaluation problem and the input-domain difference problem. This thesis also introduces methods to address these problems, with the aim of not only avoiding the same problems in the proposed methodologies, but also further improving the performance of previously-published algorithms.
Machine Learning (ML) algorithms have been widely adopted in financial services for credit risk modelling and show improved predictive performance in comparison with traditional linear models. However, the adoption of ML algorithms may raise serious validation issues related to the inherent complexity of the models. With this consideration, this thesis proposes a new perspective for testing and validating ML-based credit risk models that uses properties of the model, or properties hypothesized by users based on business rationale, to allow testers to predict how a particular change in the input should affect the output. Specifically, this thesis proposes MT as a model testing, validation and selection step, complementary to traditional model fit measures, when ML is used for credit risk modelling.
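As a rough illustration of using MT to test ML-based credit risk models, the sketch below checks one business-rationale property: raising an applicant's income, all else equal, should not raise the predicted probability of default. Everything in it is hypothetical and not taken from the thesis: the two toy scoring functions, their coefficients, the feature names, and the `violation_rate` helper.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def model_a(income, debt_ratio):
    # Hypothetical model: default probability decreases monotonically with income.
    return logistic(-0.00004 * income + 3.0 * debt_ratio - 1.0)


def model_b(income, debt_ratio):
    # Hypothetical model with a non-monotone "bump", standing in for a
    # complex ML model that fits noise in the income feature.
    bump = 0.5 * math.sin(income / 5000.0)
    return logistic(-0.00004 * income + 3.0 * debt_ratio - 1.0 + bump)


def violation_rate(model, applicants, raise_by=1000.0):
    """Fraction of MGs that violate the MR
    'higher income must not increase the predicted default probability'."""
    violations = 0
    for income, debt_ratio in applicants:
        source_output = model(income, debt_ratio)               # STC
        follow_up_output = model(income + raise_by, debt_ratio)  # FTC
        if follow_up_output > source_output + 1e-12:
            violations += 1
    return violations / len(applicants)


rng = random.Random(42)
applicants = [(rng.uniform(20000.0, 120000.0), rng.uniform(0.0, 1.0))
              for _ in range(200)]

for name, model in (("model_a", model_a), ("model_b", model_b)):
    print(name, "MR violation rate:", violation_rate(model, applicants))
```

A violation rate of this kind could be reported alongside traditional fit measures when comparing candidate models, in the spirit of the model selection step described above.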
In summary, this thesis aims to improve the effectiveness and efficiency of MT by focusing on its core parts: the MRs and MGs. The main limitation of the work is that it proposes methodologies for each part of MT separately. Future work will therefore include investigating the relationships and connections between the proposed methodologies, and proposing an overall MT framework that combines all of them.

Date of Award	13 Jul 2025
Original language	English
Awarding Institution	University of Nottingham
Supervisor	Anthony Graham Bellotti (Supervisor) & Dave Towey (Supervisor)

Keywords

Software Engineering
Software Testing
Metamorphic Testing
Metamorphic Relation
Metamorphic Group
Effectiveness
Efficiency
Credit Risk

Documents

Thesis.Zhihao Ying.20308924
File: application/pdf, 5.02 MB
Type: Thesis for reader access - any sensitive & copyright infringing material removed

Related content

Research output

SFIDMT-ART: A metamorphic group generation method based on Adaptive Random Testing applied to source and follow-up input domains

MT4NS: Metamorphic Testing for Network Scanning

Preparing SQA Professionals: Metamorphic Relation Patterns, Exploration, and Testing for Big Data

Using Metamorphic Relation Violation Regions to Support a Simulation Framework for the Process of Metamorphic Testing

Metamorphic Exploration for Machine Learning Validation and Model Selection

MT-PART: Metamorphic-Testing-Based Adaptive Random Testing Through Partitioning