Enhancing autonomous driving simulations: A hybrid metamorphic testing framework with metamorphic relations generated by GPT

Yifan Zhang, Tsong Yueh Chen, Matthew Pike, Dave Towey, Zhihao Ying, Zhi Quan Zhou

Research output: Journal Publication › Article › peer-review

Abstract

Context: Autonomous Driving Systems (ADSs) have developed rapidly over the past decade. Given the complexity and cost of testing ADSs, advanced simulation tools like the CARLA simulator are essential for efficient algorithm development and validation. However, the intricacies of autonomous driving (AD) simulations pose challenges for software testing, particularly the oracle problem: the difficulty of determining whether outputs are correct within a reasonable timeframe. While many studies validate ADS algorithms using simulations, few address the validity of the simulated data, a fundamental premise for ADS testing.

Objective: This study addresses the oracle problem in AD simulations by employing Metamorphic Testing (MT) and Metamorphic Relations (MRs) to detect software defects in the CARLA simulator. Additionally, we explore AI-driven approaches, specifically integrating ChatGPT's customizable features to enhance MR generation and refinement.

Method: We propose a human-AI hybrid MT framework that combines human inputs with AI-driven automation to generate and refine MRs. The framework uses the GPT-MR generator, a customized large language model (LLM) based on Metamorphic Relation Patterns (MRPs) and ChatGPT, to produce MRs according to user specifications. These MRs are then refined by MT experts and fed into a test harness, which automates test-case creation and execution while supporting diverse parameter inputs.

Results: The GPT-MR generator produced effective MRs, which led to the discovery of four significant defects in the CARLA simulator. The test harness enabled efficient, automated testing across multiple modules and vehicle-control approaches, which enhanced the robustness and efficiency of our methods.

Conclusions: Our study highlights the effectiveness of MT and MRPs in addressing the oracle problem for AD simulations, enhancing software reliability, and ensuring robust validation processes. The combination of AI-driven tools and human knowledge offers a structured methodology for validating simulated data and ADS performance, contributing to more reliable ADS development and testing.
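
As an illustration of how a metamorphic relation sidesteps the oracle problem, the sketch below checks a single hypothetical MR against a toy kinematic model. It is not taken from the paper and does not use the CARLA API: the `Scenario` class, the `simulate` function, and the "translation invariance" relation are all assumptions made for this example. The idea is that the exact expected position is never needed; the test only compares a source run with a follow-up run whose spawn point has been shifted.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    """Hypothetical test input: a vehicle accelerating along a straight road."""
    start_x: float        # spawn position in metres
    acceleration: float   # constant acceleration in m/s^2
    duration: float       # simulated time in seconds
    step: float = 0.05    # integration step in seconds


def simulate(s: Scenario) -> float:
    """Toy stand-in for a simulator run; returns the final x-position."""
    x, v = s.start_x, 0.0
    for _ in range(round(s.duration / s.step)):
        v += s.acceleration * s.step
        x += v * s.step
    return x


def check_translation_mr(
    run: Callable[[Scenario], float],
    source: Scenario,
    shift: float,
    tol: float = 1e-6,
) -> bool:
    """Assumed MR: shifting the spawn point must not change distance travelled."""
    follow_up = replace(source, start_x=source.start_x + shift)
    d_source = run(source) - source.start_x
    d_follow_up = run(follow_up) - follow_up.start_x
    return abs(d_source - d_follow_up) <= tol


def simulate_with_defect(s: Scenario) -> float:
    """Same toy model with an injected position-dependent drift (a seeded bug)."""
    return simulate(s) + 0.01 * s.start_x


source_case = Scenario(start_x=0.0, acceleration=2.0, duration=5.0)
print(check_translation_mr(simulate, source_case, shift=100.0))
print(check_translation_mr(simulate_with_defect, source_case, shift=100.0))
```

The first check holds for the toy model, while the seeded defect violates the relation, which is how an MR can flag a fault without knowing the correct output for either run.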

Original language: English
Article number: 107828
Journal: Information and Software Technology
Volume: 187
Publication status: Published - Nov 2025

Keywords

  • Autonomous driving (AD)
  • ChatGPT
  • Large language models (LLMs)
  • Metamorphic relation (MR)
  • Metamorphic relation pattern (MRP)
  • Metamorphic testing (MT)
  • Oracle problem
  • Simulation
  • Test harness

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Computer Science Applications
