Int J Performability Eng, 2018, Vol. 14, Issue 5: 849-856. doi: 10.23940/ijpe.18.05.p3.849856

• Original articles •

Decision Tree Incremental Learning Algorithm Oriented Intelligence Data

Hongbin Wang, Ci Chu, Xiaodong Xie, Nianbin Wang, and Jing Sun   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin, 150001, China

Abstract:

The decision tree is one of the most popular classification methods because it is easy to comprehend. However, the decision trees constructed by existing methods are usually too large and complicated, which limits their practicability in some applications. In this paper, an improved hybrid classifier algorithm, HCS, is proposed by combining the NOLCDT algorithm with the IID5R algorithm. HCS consists of two phases: building the initial decision tree and incremental learning. The initial decision tree is constructed with the NOLCDT algorithm, and incremental learning is then performed with IID5R. The NOLCDT algorithm selects the candidate attribute with the largest information gain and divides the node into two branches, which avoids generating too many branches and thus prevents the decision tree from becoming too complex. NOLCDT also improves the selection of the next node to be split: it computes the splitting measure for all candidate splits and always selects, from all candidate nodes, the node with the largest information gain as the next node to split, so that each split achieves the greatest information gain. In addition, an improved algorithm based on ID5R, IID5R, is proposed; it evaluates the quality of classification attributes and estimates the minimum number of steps for which the selection of these attributes is guaranteed. HCS combines the strengths of decision trees and incremental learning: the resulting model is easy to understand and suited to incremental updates. Comparative experiments between traditional decision tree algorithms and HCS on UCI data sets show that HCS handles the incremental learning problem well: the resulting decision tree is simpler and therefore easier to understand, and the incremental phase takes less time.
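The two selection ideas described for the tree-building phase (a two-way split on the attribute with the largest information gain, and choosing the next node to split by the largest gain among all candidate nodes) can be illustrated with a minimal sketch. This is not the authors' NOLCDT implementation, whose details the abstract does not give. It assumes categorical attributes, Shannon entropy as the impurity measure, and a binary split of the form "attribute equals value" versus "everything else"; all function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def binary_split_gain(rows, labels, attr, value):
    """Information gain of the two-way split (row[attr] == value) vs. the rest."""
    left = [y for x, y in zip(rows, labels) if x[attr] == value]
    right = [y for x, y in zip(rows, labels) if x[attr] != value]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def best_binary_split(rows, labels):
    """Return (gain, attr, value) for the two-way split with the largest gain."""
    best = (0.0, None, None)
    for attr in rows[0]:
        for value in sorted({x[attr] for x in rows}):
            gain = binary_split_gain(rows, labels, attr, value)
            if gain > best[0]:
                best = (gain, attr, value)
    return best


def pick_next_node(leaves):
    """Among candidate leaves, pick the one whose best split has the largest gain.

    `leaves` maps a (hypothetical) node name to its (rows, labels) pair.
    """
    scored = [(best_binary_split(rows, labels), name)
              for name, (rows, labels) in leaves.items()]
    (gain, attr, value), name = max(scored, key=lambda s: s[0][0])
    return name, attr, value, gain


# Toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
]
labels = ["no", "no", "yes", "yes"]

gain, attr, value = best_binary_split(rows, labels)

# Node "B" is already pure, so node "A" should be chosen for the next split.
leaves = {"A": (rows, labels), "B": (rows[:2], labels[:2])}
next_node = pick_next_node(leaves)
```

Because every split produces exactly two children, the tree grows by one leaf per step, and scanning all current leaves for the largest gain is what makes each step the locally most informative one in this sketch.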


Submitted on January 29, 2018; Revised on March 12, 2018; Accepted on April 23, 2018
References: 11