Int J Performability Eng, 2025, Vol. 21, Issue (5): 235-248. doi: 10.23940/ijpe.25.05.p1.235248


Fusion Mutation-Based Test Generation and XGBoost-Driven Prioritization for Image Classification DNNs

Qian Zhang^a and Dongcheng Li^b,*

  ^a School of Computer Science, China University of Geosciences, Wuhan, China;
  ^b Department of Computer Science, California State Polytechnic University - Humboldt, Arcata, USA
  • Contact: * E-mail address: dl313@humboldt.edu

Abstract: With deep learning increasingly employed in safety-critical domains, ensuring the reliability of deep neural networks (DNNs) has become paramount. Although traditional software testing can detect model errors, the substantial cost of assembling large, manually annotated test sets remains a key challenge. To address this, we propose (1) a fusion mutation-based test case generation technique and (2) a test case prioritization algorithm based on feature analysis. The fusion mutation method enriches test diversity through both data mutation and model mutation. By designing a hyperparameter optimization space for image distortion and employing an improved Bayesian optimization algorithm, our approach rapidly identifies optimal mutation parameters and adaptively generates test sets from minimal data. The mutated images simulate various distortion scenarios and form the input to prioritization. The prioritization algorithm leverages differential, rule, and effectiveness features, combined with an XGBoost-based strategy that ranks the most error-prone test cases first and restricts ineffective mutations. This expedites the identification of potential DNN defects and improves testing efficiency. Experiments using popular image classification networks on multiple datasets demonstrate that our method outperforms other state-of-the-art approaches in 50% of tested scenarios, achieving a performance gain of 2% to 9.2%. These findings validate our method’s effectiveness in uncovering diverse error types in DNNs and in generating high-quality test sets while balancing test data efficiency and diversity.

Key words: deep neural network, test prioritization, test generation, fusion mutation, image classification