Int J Performability Eng, 2018, Vol. 14, Issue 12: 3174-3183. doi: 10.23940/ijpe.18.12.p27.31743183

Coarse-Grained Parallel AP Clustering Algorithm based on Intra-Class and Inter-Class Distance

Suzhi Zhang, Rui Yang, and Yanan Zhao

  1. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
  • Contact: Suzhi Zhang, E-mail: zhsuzhi@zzuli.edu.cn

Abstract:

Affinity Propagation (AP) clustering is an algorithm based on message passing between data points; it forms clusters mainly from the pairwise similarity between data points. Unlike traditional clustering methods, AP does not require the number of clusters to be specified in advance, which makes it fast and efficient. However, it has certain limitations when dealing with high-dimensional, complex datasets. To improve the efficiency and accuracy of AP clustering, this paper proposes IOCAP, a coarse-grained parallel AP clustering algorithm based on intra-class and inter-class distances. First, the idea of granularity is introduced to divide the initial dataset into multiple subsets. Second, for each subset, the similarity matrix is improved by combining the intra-class and inter-class distances. Finally, the improved parallel AP clustering is implemented on the MapReduce model. Experiments on the Iris, Diabetes, and MNIST datasets show that IOCAP adapts well to large datasets and effectively improves clustering accuracy while maintaining the AP clustering effect.

Key words: AP clustering, granularity, intra-class distance, inter-class distance, parallel processing
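
The three steps outlined in the abstract (split the data into coarse-grained subsets, run AP on each subset, then combine the results) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the AP updates are the standard responsibility/availability rules, similarity is the plain negative squared Euclidean distance (the improved intra-class/inter-class similarity is not specified in the abstract and is not reproduced here), the preference is set to the median similarity, a sequential loop stands in for the MapReduce map phase, and the merge step (a second AP pass over the local exemplars) is a hypothetical choice. The function names `affinity_propagation` and `coarse_grained_ap` are illustrative only.

```python
import random

def neg_sq_dist(a, b):
    # Standard AP similarity: negative squared Euclidean distance.
    return -sum((x - y) ** 2 for x, y in zip(a, b))

def median(values):
    v = sorted(values)
    m = len(v) // 2
    return v[m] if len(v) % 2 else 0.5 * (v[m - 1] + v[m])

def affinity_propagation(points, damping=0.5, iters=200):
    """Plain AP; returns the indices of the points chosen as exemplars."""
    n = len(points)
    if n == 1:
        return [0]
    S = [[neg_sq_dist(p, q) for q in points] for p in points]
    # Assumption: preference = median of the off-diagonal similarities.
    pref = median([S[i][k] for i in range(n) for k in range(n) if i != k])
    for i in range(n):
        S[i][i] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i,k) = s(i,k) - max_{k' != k} (a(i,k') + s(i,k'))
        for i in range(n):
            row = [A[i][k] + S[i][k] for k in range(n)]
            first = max(range(n), key=row.__getitem__)
            best = row[first]
            second = max(v for k, v in enumerate(row) if k != first)
            for k in range(n):
                new = S[i][k] - (second if k == first else best)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback: keep the single strongest candidate
        exemplars = [max(range(n), key=lambda k: R[k][k] + A[k][k])]
    return exemplars

def coarse_grained_ap(points, n_subsets=2, seed=0):
    """Split -> AP per subset ("map") -> AP over local exemplars ("reduce")."""
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)
    chunks = [idx[j::n_subsets] for j in range(n_subsets)]
    candidates = []
    for chunk in chunks:  # each iteration could run as an independent map task
        if chunk:
            local = affinity_propagation([points[i] for i in chunk])
            candidates.extend(chunk[k] for k in local)
    merged = affinity_propagation([points[i] for i in candidates])
    final = [candidates[k] for k in merged]
    # Label every point with the index of its most similar final exemplar.
    labels = [max(final, key=lambda e: neg_sq_dist(p, points[e])) for p in points]
    return final, labels

if __name__ == "__main__":
    rng = random.Random(1)
    # Two synthetic 2-D blobs, made up purely for illustration.
    data = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(10)]
    data += [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(10)]
    exemplars, labels = coarse_grained_ap(data, n_subsets=2)
    print("exemplar indices:", exemplars)
    print("labels:", labels)
```

Because each subset is clustered independently, the per-subset calls are the part that a MapReduce job would distribute; only the much smaller set of local exemplars has to be gathered for the final pass.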