Int J Performability Eng, 2018, Vol. 14, Issue (7): 1391-1400. doi: 10.23940/ijpe.18.07.p2.13911400

• Original articles •

Mixed Weighted KNN for Imbalanced Datasets

Qimin Cao (a), Lei La (b), Hongxia Liu (a), and Si Han (a)

  a. Library, China University of Political Science and Law, Beijing, 100088, China
  b. School of Information Technology & Management, University of International Business and Economics, Beijing, 100029, China

Abstract:

Imbalanced datasets are common in practice and can reduce classification accuracy. To address the class imbalance problem, this paper proposes the mixed weighted KNN (MW-KNN) algorithm. The algorithm assigns each sample in the dataset an inverse-proportion weight based on the imbalance between the classes, and then combines that weight with a distance weight so that training samples closer to the test sample receive greater weight. To improve operating efficiency and make massive datasets easier to handle, we implement a parallel version of MW-KNN on the Hadoop framework. Experimental results show that the proposed algorithm is simple and effective.
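
The weighting idea described in the abstract can be illustrated with a minimal, single-machine sketch. The abstract does not give the exact formulas, so everything specific here is an assumption: the class weight is taken as the total sample count divided by the class count, the distance weight as 1 / (d + eps), and the two are combined by multiplication. The function name `mw_knn_predict` and its parameters are hypothetical, and the Hadoop parallelization is not shown.

```python
import math
from collections import Counter, defaultdict


def mw_knn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Hypothetical sketch of mixed (class x distance) weighted KNN voting."""
    # Assumed class weight: total samples / samples in the class,
    # so minority classes receive a larger weight.
    counts = Counter(train_y)
    n = len(train_y)
    class_weight = {label: n / count for label, count in counts.items()}

    # Euclidean distance from the query to every training sample.
    dists = [(math.dist(x, query), y) for x, y in zip(train_X, train_y)]

    # Keep the k nearest training samples.
    neighbors = sorted(dists, key=lambda pair: pair[0])[:k]

    # Assumed distance weight: 1 / (d + eps), so closer samples count more.
    # The mixed weight is the product of the class and distance weights.
    votes = defaultdict(float)
    for d, label in neighbors:
        votes[label] += class_weight[label] / (d + eps)

    # Predict the class with the largest accumulated mixed weight.
    return max(votes, key=votes.get)
```

With five majority-class points near the origin and a single minority-class point at (2, 2), a query at (1.6, 1.6) with k = 3 has two majority neighbors and one minority neighbor. A plain majority vote would pick the majority class, while this sketch picks the minority class because its neighbor is both closer and more heavily weighted.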


Submitted on April 13, 2018; Revised on May 25, 2018; Accepted on June 25, 2018
References: 21