Int J Performability Eng, 2018, Vol. 14, Issue (1): 17-25. doi: 10.23940/ijpe.18.01.p3.1725

• Original articles •

A Classification Algorithm of CART Decision Tree based on MapReduce Attribute Weights

Fubao Zhu, Mengmeng Tang, Lijie Xie, and Haodong Zhu   

  1. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan, 450002, China

Abstract:

This paper proposes a CART decision tree algorithm based on attribute weights to address the problems of existing CART decision trees: complex classification, poor accuracy, low efficiency, and severe memory consumption. The algorithm is combined with the MapReduce parallel computing model. It applies the theory of attribute weights: the decision tree is built from the sum of weights, where each weight is determined by the degree to which an attribute affects the decision, and this improves the classification accuracy of the tree. Parallel sorting algorithms of the CART decision tree for massive data are implemented with the MapReduce programming technology of cloud computing. Theoretical analysis and experimental comparison both show that weighting attributes through MapReduce is very important: classification accuracy on large sample data sets improves significantly, the classification efficiency of the decision tree improves, and training time is significantly reduced.
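
The idea of choosing a CART split from class counts gathered in a map and reduce pass, then scaling the result by an attribute weight, can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the abstract does not state how weights enter the split criterion, so the scoring rule here (attribute weight multiplied by Gini gain) is an assumption, and the names `map_phase`, `reduce_phase`, `best_split`, and the `weights` dictionary are hypothetical.

```python
from collections import Counter, defaultdict


def map_phase(records, attributes):
    """Mapper: emit ((attribute, value, label), 1) for every record."""
    for row, label in records:
        for attr in attributes:
            yield (attr, row[attr], label), 1


def reduce_phase(pairs):
    """Reducer: sum the emitted counts per (attribute, value, label) key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts


def gini(class_counts):
    """Gini impurity of a non-empty class-count table."""
    total = sum(class_counts.values())
    return 1.0 - sum((c / total) ** 2 for c in class_counts.values())


def best_split(counts, weights):
    """Pick the binary split 'attr == value' with the highest weighted Gini gain.

    ASSUMPTION: score = weight(attr) * Gini gain. The source does not
    specify this rule; it is used here only for illustration.
    """
    by_attr = defaultdict(lambda: defaultdict(Counter))
    for (attr, value, label), n in counts.items():
        by_attr[attr][value][label] += n

    best = None
    for attr, values in by_attr.items():
        total = Counter()
        for label_counts in values.values():
            total.update(label_counts)
        n_total = sum(total.values())
        parent = gini(total)
        for value, left in values.items():
            right = total - left  # class counts of the records with attr != value
            n_left = sum(left.values())
            n_right = n_total - n_left
            if n_right == 0:
                continue  # not a real split: every record falls on one side
            split = (n_left * gini(left) + n_right * gini(right)) / n_total
            score = weights.get(attr, 1.0) * (parent - split)
            if best is None or score > best[0]:
                best = (score, attr, value)
    return best


# Toy data, invented for this sketch.
records = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rain", "windy": "no"}, "stay"),
    ({"outlook": "rain", "windy": "yes"}, "stay"),
]
counts = reduce_phase(map_phase(records, ["outlook", "windy"]))
result = best_split(counts, {"outlook": 1.0, "windy": 1.0})
```

In a real MapReduce job the mapper would run on separate data partitions and the reducer would merge their counts, so only the compact count table, not the raw data, is needed to evaluate candidate splits.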


Submitted on July 24, 2017; Revised on September 25, 2017; Accepted on November 29, 2017
References: 28