Int J Performability Eng, 2025, Vol. 21, Issue (2): 94-103. doi: 10.23940/ijpe.25.02.p4.94103

• Original article

Addressing Class Imbalance in Software Fault Prediction using BVPC-SENN: A Hybrid Ensemble Approach

Ashu Mehta (a, b, *), Navdeep Kaur (b), and Amandeep Kaur (c)

  (a) Department of Computer Science and Engineering, Lovely Professional University, Phagwara, India;
  (b) Department of Computer Science, Sri Guru Granth Sahib World University, Fatehgarh Sahib, India;
  (c) Department of Computer Engineering, NIT Kurukshetra, Haryana, India
  • Contact: *E-mail address: ashu.23631@lpu.co.in

Abstract: Software fault prediction plays a crucial role in maintaining software quality by identifying defect-prone modules early in the development cycle. However, class imbalance, high-dimensional data, and the limitations of individual classifiers reduce the effectiveness of prediction models. This paper addresses these issues by proposing a new Balanced Voting-PCA Classifier with SMOTE-ENN (BVPC-SENN) model. The BVPC-SENN model combines SMOTE-ENN resampling to handle class imbalance, Principal Component Analysis (PCA) for dimensionality reduction, and a weighted voting ensemble with Bernoulli Naive Bayes (BNB), Gaussian Naive Bayes (GNB), Random Forest (RF), and Support Vector Machines (SVM) as base classifiers. To provide reliable and accurate fault predictions, the model balances the dataset, reduces the feature dimensions, and combines the predictions of the base classifiers using a weighted voting method. Experiments on several benchmark datasets show that the BVPC-SENN model achieves improved accuracy, precision, and generalization. By addressing class imbalance, optimizing feature representation, and utilizing ensemble learning, the proposed approach advances the state of the art in software fault prediction and provides a useful framework for strengthening software quality assurance procedures.

Key words: class imbalance, ensemble learning, feature selection, SMOTE, BVPC-SENN
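
Illustrative note: the weighted voting step named in the abstract can be sketched as follows. The abstract does not state whether the voting is hard or soft, how the weights are chosen, or what decision threshold is used, so this sketch assumes soft voting (a weighted average of each base classifier's estimated fault probability) with hypothetical weights and a hypothetical threshold of 0.5. The function names and all numeric values are illustrative assumptions, not taken from the paper.

```python
def weighted_soft_vote(probas, weights):
    """Return the weighted average of per-classifier P(faulty) estimates."""
    if len(probas) != len(weights):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probas, weights)) / total


def predict_faulty(probas, weights, threshold=0.5):
    """Flag a module as fault-prone when the combined score reaches the threshold.

    The 0.5 default is an assumption made for this sketch.
    """
    return weighted_soft_vote(probas, weights) >= threshold


# Hypothetical P(faulty) for one module from BNB, GNB, RF, and SVM, in that order.
module_probas = [0.40, 0.55, 0.80, 0.70]
# Hypothetical weights that trust RF and SVM twice as much as the Naive Bayes models.
classifier_weights = [1.0, 1.0, 2.0, 2.0]

score = weighted_soft_vote(module_probas, classifier_weights)
is_faulty = predict_faulty(module_probas, classifier_weights)
print(f"combined score = {score:.3f}, predicted faulty = {is_faulty}")
```

In a full pipeline, the probabilities would come from classifiers trained on data that has already been resampled and projected onto principal components. Library components such as imbalanced-learn's `SMOTEENN` and scikit-learn's `PCA` and `VotingClassifier` offer one possible way to assemble those stages, though the abstract does not say which tools the authors used.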