Abstract
The development of autonomous driving systems (ADSs) represents a significant evolution in transportation, with simulation being a crucial factor in their development. However, because of their complexity and critical nature, one of the challenges in testing these systems is the oracle problem. In software testing, an oracle is a mechanism used to systematically verify the correctness of the outputs for any given input. The oracle problem arises when it is difficult or impossible to find or use such an oracle. Metamorphic testing (MT) has proven to be an effective approach to alleviate the oracle problem. It uses metamorphic relations (MRs), which are relations among multiple inputs and their corresponding outputs, to verify test results. Generating MRs remains a significant challenge in the application of MT, especially in complex systems like ADSs. Traditionally, MRs are produced by domain experts, while metamorphic relation patterns (MRPs) and metamorphic relation input patterns (MRIPs) provide a structured approach to generating MRs. Additionally, the rise of large language models (LLMs), such as ChatGPT, provides opportunities to reduce the manual effort involved in the MT process. Metamorphic exploration (ME), as an extension of MT, employs hypothesized MRs (HMRs) to assess the system under test (SUT). A violation of these relations may not indicate a defect but can reveal the user's misunderstanding of the SUT, prompting further exploration of the system.
This thesis aims to enhance MT in ADS testing by alleviating the oracle problem in these systems, improving testing efficiency, and promoting educational practices. Through a series of experiments, it demonstrates how MT can be effectively applied to tackle challenges in ADS testing.
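As a minimal illustration of how an MR can stand in for an oracle, the sketch below checks a toy function against a single relation. Everything in it is hypothetical and is not taken from the thesis: `stopping_distance` is an invented stand-in for an SUT, and the relation (a higher speed must not yield a shorter stopping distance) is only one example of the kind of MR a tester might propose.

```python
def stopping_distance(speed_mps: float,
                      reaction_s: float = 1.0,
                      decel_mps2: float = 6.0) -> float:
    """Hypothetical toy SUT: reaction distance plus braking distance."""
    return speed_mps * reaction_s + speed_mps ** 2 / (2.0 * decel_mps2)


def mr_holds(sut, source_speed: float, delta: float) -> bool:
    """Illustrative MR: raising the speed by a positive delta must not
    shorten the stopping distance. No exact expected output is needed;
    only the relation between the two outputs is checked."""
    follow_up_speed = source_speed + delta
    return sut(follow_up_speed) >= sut(source_speed)


def run_mt(sut, source_speeds, delta: float = 1.0):
    """Return the source inputs for which the MR is violated."""
    return [v for v in source_speeds if not mr_holds(sut, v, delta)]


if __name__ == "__main__":
    violations = run_mt(stopping_distance, [0.0, 5.0, 10.0, 30.0])
    print("MR violations:", violations)
```

In an ME setting, a non-empty list of violations would be treated as a prompt to investigate further, since it may reflect a mistaken hypothesis about the SUT rather than a defect.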
Key contributions include applying ME and MT to the Baidu Apollo ADS, which leads to enhanced system comprehension and the identification of conflicting obstacle-detection results in the perception-camera module. An ADS-based test harness is designed to improve testing efficiency and is validated through an industry case study.
While MT has proven effective in testing ADSs, it still depends heavily on human knowledge, particularly in the MR generation process. This thesis introduces the use of ChatGPT for generating and evaluating MRs, along with a set of evaluation criteria for objective assessments of MR quality. A GPT evaluator is also developed, demonstrating AI's potential to assist beginners and enhance MT practices. Comparative studies with human-generated MRs further underscore the potential of LLMs and highlight areas for educational improvement. Additionally, an Open Educational Resource (OER) is introduced to provide guidelines and templates for teaching beginners about scenario and MR generation.
To address the challenge of identifying whether anomalies originate from the ADS or the simulator, this thesis applies MT to Autonomous Driving (AD) simulators. This reveals critical bugs in the NIO and CARLA simulators and demonstrates MT’s effectiveness in ensuring simulator reliability. A scenario-driven MT framework integrating ME and MT is proposed to enhance defect discovery and reporting, and is validated through an industry case study. Additionally, MRPs and MRIPs tailored for AD systems are introduced, enabling effective defect detection. These are further incorporated into a human-AI hybrid MT framework with a test harness to streamline MR generation and automate test case execution, enhancing testing efficiency.
In summary, through practical experiments and methodological improvements, the thesis fills significant gaps in MT and ADS testing. It demonstrates the practical utility of MT in ADS testing, introduces templates, integrates AI into the MT process, and develops educational resources along with advanced frameworks and tools. These contributions enhance testing efficiency, reliability, and educational practices in the field of MT and ADSs.
Date of Award | 13 Jul 2025 |
---|---|
Original language | English |
Awarding Institution | |
Supervisor | Dave Towey (Supervisor), Matthew Pike (Supervisor) & Xu Sun (Supervisor) |
Keywords
- autonomous driving systems
- metamorphic testing