Validating autonomous driving: metamorphic testing for autonomous vehicles and simulators

Yifan Zhang

School of Computer Science

Student thesis: PhD Thesis

Abstract

The development of autonomous driving systems (ADSs) represents a significant evolution in transportation, with simulation being a crucial factor in their development. However, one of the challenges for these ADS-related systems, due to their complexity and critical nature, is the oracle problem. In software testing, an oracle is a mechanism used to systematically verify the correctness of the outputs for any given input. The oracle problem arises when it is difficult or impossible to find or use such an oracle. Metamorphic testing (MT) has proven to be an effective approach to alleviate the oracle problem. It uses metamorphic relations (MRs), which are relations among multiple inputs and their corresponding outputs, to verify test results. The process of generating MRs remains a significant challenge in the application of MT, especially in complex systems like ADSs. Traditionally, MRs are produced by domain experts, while approaches like metamorphic relation patterns (MRPs) and metamorphic relation input patterns (MRIPs) provide a structured approach to the generation of MRs. Additionally, the rise of large language models (LLMs), such as ChatGPT, provides opportunities to reduce the manual effort involved in the MT process. Metamorphic exploration (ME), as an extension of MT, employs hypothesized MRs (HMRs) to assess the system under test (SUT). A violation of these relations may not indicate a defect but can reveal the user's misunderstanding of the SUT, prompting further exploration of the system.
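The idea of checking a relation between a source input and a follow-up input, rather than checking a single output against an oracle, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy `stopping_distance` function stands in for a system under test, `buggy_stopping_distance` is a deliberately faulty variant, and the relation "a higher initial speed never yields a shorter stopping distance" is a hypothetical MR.

```python
from typing import Callable, List


def stopping_distance(speed_mps: float) -> float:
    """Toy system under test (hypothetical): reaction distance plus braking distance."""
    reaction_s = 1.0   # assumed reaction time in seconds
    decel_mps2 = 6.0   # assumed constant deceleration in m/s^2
    return speed_mps * reaction_s + speed_mps ** 2 / (2.0 * decel_mps2)


def buggy_stopping_distance(speed_mps: float) -> float:
    """Deliberately faulty variant: returns a fixed value from 30 m/s upwards."""
    if speed_mps >= 30.0:
        return 50.0
    return stopping_distance(speed_mps)


def mr_holds(sut: Callable[[float], float], speed: float, delta: float) -> bool:
    """Hypothetical MR: increasing the speed must not shorten the stopping distance.

    No oracle for the exact distance is needed; only the relation between
    the source output and the follow-up output is checked.
    """
    source_output = sut(speed)             # source test case
    follow_up_output = sut(speed + delta)  # follow-up test case
    return follow_up_output >= source_output


def find_violations(sut: Callable[[float], float],
                    speeds: List[float],
                    delta: float = 1.0) -> List[float]:
    """Return the source speeds at which the MR is violated."""
    return [s for s in speeds if not mr_holds(sut, s, delta)]


if __name__ == "__main__":
    speeds = [0.0, 10.0, 20.0, 29.5, 40.0]
    print(find_violations(stopping_distance, speeds))
    print(find_violations(buggy_stopping_distance, speeds))
```

In a metamorphic-exploration setting, a reported violation would be a prompt to investigate rather than an automatic defect report, since the hypothesized relation itself may reflect a misunderstanding of the system.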

This thesis aims to enhance MT in ADS testing by alleviating the oracle problem in these systems, improving testing efficiency, and promoting educational practices. Through a series of experiments, it demonstrates how MT can be effectively applied to tackle challenges in ADS testing. Key contributions include applying ME and MT to the Baidu Apollo ADS, which leads to enhanced system comprehension and the identification of conflicting obstacle-detection results in the perception-camera module. An ADS-based test harness is designed to improve testing efficiency and is validated through an industry case study.

While MT is proven effective in testing ADSs, it still heavily depends on human knowledge, particularly in the MR generation process. This thesis introduces the use of ChatGPT for generating and evaluating MRs, along with a set of evaluation criteria for objective assessments of MR quality. A GPT evaluator was also developed, demonstrating AI's potential to assist beginners and enhance MT practices. Comparative studies with human-generated MRs further underscore the potential of LLMs and highlight areas for educational improvement. Additionally, an Open Educational Resource (OER) is introduced to provide guidelines and templates for teaching beginners about scenario and MR generation.

To address the challenge of identifying whether anomalies originate from the ADS or the simulator, this thesis applies MT to autonomous driving (AD) simulators to enhance ADS testing, revealing critical bugs in the NIO and CARLA simulators and demonstrating MT’s effectiveness in ensuring simulator reliability. A scenario-driven MT framework integrating ME and MT is proposed to enhance defect discovery and reporting, and is validated through an industry case study. Additionally, MRPs and MRIPs tailored for AD systems are introduced, enabling effective defect detection. These are further incorporated into a human-AI hybrid MT framework with a test harness to streamline MR generation and automate test case execution, enhancing testing efficiency.

In summary, through practical experiments and methodological improvements, the thesis fills significant gaps in MT and ADS testing. It demonstrates the practical utility of MT in ADS testing, introduces templates, integrates AI into the MT process, and develops educational resources along with advanced frameworks and tools. These contributions enhance testing efficiency, reliability, and educational practices in the field of MT and ADSs.

Date of Award	13 Jul 2025
Original language	English
Awarding Institution	University of Nottingham
Supervisor	Dave Towey (Supervisor), Matthew Pike (Supervisor) & Xu Sun (Supervisor)

Keywords

Autonomous driving systems
Metamorphic testing

Documents

20250515.PhD_Thesis_source
File: application/pdf, 40.4 MB
Type: Thesis-as examined

Related content

Research output

Metamorphic testing harness for the Baidu Apollo perception-camera module

Metamorphic Testing of a Steer-by-Wire System: An Intercultural Students-as-Partners Collaboration Experience

Enabling Effective Metamorphic-Relation Generation by Novice Testers: A Pilot Study

Metamorphic testing of an automated parking system: an experience report

MT4NS: Metamorphic Testing for Network Scanning

Preparing Future SQA Professionals: An Experience Report of Metamorphic Exploration of an Autonomous Driving System

Scenario-Driven Metamorphic Testing for Autonomous Driving Simulators

No pain, no gain: the necessary initial struggles to enable doctoral research work

Automated metamorphic-relation generation with ChatGPT: an experience report

Enhancing ADS Testing: An Open Educational Resource for Metamorphic Testing