Int J Performability Eng, 2026, Vol. 22, Issue (3): 167-177. doi: 10.23940/ijpe.26.03.p6.167177

• Original article

Adaptive Ensemble Learning for Software Defect Prediction with Imbalanced Data

Ashu Mehta*   

  1. Department of Computer Science and Engineering, Lovely Professional University, Punjab, India
  • Contact: Ashu Mehta
  • About author:
    * Corresponding author.
    E-mail address: ashu.23631@lpu.co.in

Abstract:

Software Fault Prediction (SFP) plays a crucial role in improving software reliability by enabling the early detection of defect-prone modules. However, persistent problems such as extreme class imbalance and unstable classifier performance across heterogeneous datasets limit the effectiveness of current methods. To address these problems, this paper proposes a stability-conscious meta-ensemble learning architecture that combines adaptive sampling with meta-level classifier fusion. Unlike traditional ensemble-based approaches that rely on resampling and fixed combinations of models, the proposed architecture dynamically selects sampling techniques based on the properties of the data and learns the best combination of classifiers through a meta-learner. Extensive experiments on benchmark datasets from PROMISE, NASA, AEEEM, ReLink, and SoftLab show consistent improvements over baseline ensemble models in AUC, MCC, and G-mean. Moreover, cross-project fault prediction experiments demonstrate strong generalization and low performance degradation. Statistical analyses, including the Wilcoxon signed-rank test, Cliff's delta effect size, and the Nemenyi post-hoc test, confirm the robustness of the proposed method. Overall, the framework offers a practical and generalizable approach to addressing class imbalance and performance instability in real-world software fault prediction.

Key words: software fault prediction, meta-ensemble learning, class imbalance, adaptive sampling, cross-project prediction
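
The abstract reports results in MCC and G-mean, two metrics commonly preferred over plain accuracy on imbalanced data. As a minimal sketch of their standard definitions (not the authors' evaluation code; the function names and the toy labels are illustrative assumptions, with 1 marking a faulty module), they can be computed from a binary confusion matrix as follows:

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels; 1 = faulty, 0 = clean."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def g_mean(tp, fp, tn, fn):
    """Geometric mean of recall (faulty class) and specificity (clean class)."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall * specificity)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 2 faulty modules out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_clean = [0] * 10                          # trivial majority-class predictor
one_hit = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]         # finds one fault, one false alarm

for name, y_pred in [("always_clean", always_clean), ("one_hit", one_hit)]:
    counts = confusion_counts(y_true, y_pred)
    accuracy = (counts[0] + counts[2]) / len(y_true)
    print(name, "accuracy:", accuracy,
          "G-mean:", round(g_mean(*counts), 4),
          "MCC:", round(mcc(*counts), 4))
```

On this toy data both predictors reach the same accuracy, but the majority-class predictor scores zero on both G-mean and MCC, which is why such metrics are more informative when faulty modules are rare.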