International Journal of Performability Engineering, 2019, 15(1): 252-260 doi: 10.23940/ijpe.19.01.p25.252260

Incremental Integration Algorithm based on Incremental RLID3

Hongbin Wang a, Lei Hu b, Xiaodong Xie a, Lianke Zhou a,*, and Huafeng Li a

a College of Computer Science and Technology, Harbin Engineering University, Harbin, 150001, China

b General Office, Systems Engineering Research Institute, Beijing, 100094, China

* Corresponding author. E-mail address: zhoulianke@hrbeu.edu.cn

Accepted: 2018-12-23   Online: 2019-01-01

About authors

Hongbin Wang received his PhD degree in computer application technology from Harbin Engineering University in 2010. He is currently an associate professor in the College of Computer Science and Technology of Harbin Engineering University. His research interests include information management, data integration, data space, semantic web, ontology and information system design.

Lei Hu received his Master’s degree in computer application technology from Wuhan University in 2004. He is currently a senior engineer of SERI. His research interests include information management, systems engineering, and system integration.

Xiaodong Xie has been studying for a Bachelor’s degree in computer application technology at Harbin Engineering University since 2018. His research interests include deep learning, big data, AI, and machine learning.

Lianke Zhou received his PhD degree in computer architecture from Harbin Institute of Technology in 2011. He is currently a lecturer in the College of Computer Science and Technology of Harbin Engineering University. His research interests include data visualization, dataspace, distributed computing and mobile computing. E-mail: zhoulianke@hrbeu.edu.cn.

Huafeng Li received his Master’s degree in software engineering from Harbin Engineering University in 2016. His research interests include information management, data integration, and data classification.

Abstract

In the research process of the ID3 algorithm, several deficiencies were found. The RLID3 algorithm improves ID3 with respect to the number of leaf nodes. Incre_RLID3 is an incremental learning algorithm based on the decision tree constructed by RLID3: it adjusts the structure of the tree using an incremental data set, so that new data can be exploited without discarding the original decision tree. To further improve accuracy, this paper also proposes an ensemble algorithm, PAR_WT. Its basic idea is to use the data set to generate multiple RLID3 decision trees; the test samples are then classified by each decision tree, and the results are combined by weighted voting, which further improves the predictive ability of the algorithm. Finally, by combining the PAR_WT and Incre_RLID3 algorithms, an incremental ensemble algorithm, Incre_RLID3_ENM, with incremental learning ability is obtained.

Keywords: incremental integration; ensemble learning; incremental learning; decision tree


Cite this article

Hongbin Wang, Lei Hu, Xiaodong Xie, Lianke Zhou, and Huafeng Li. Incremental Integration Algorithm based on Incremental RLID3. International Journal of Performability Engineering, 2019, 15(1): 252-260 doi:10.23940/ijpe.19.01.p25.252260

1. Introduction

In recent years, research on incremental ensemble methods has been active. Typical incremental ensemble methods include the Learn++ algorithm [1-2] and selective ensemble methods [3-5]. These methods play a leading role in the research of incremental integration, and their effectiveness has been verified by extensive experiments and applications.

2. The Deficiency of RLID3 Algorithm

In actual production, data usually arrive at irregular times. In this case, the data analysis system is required to learn incrementally: it must learn from new data without forgetting the model learned before. This learning method is called incremental learning. Incremental learning [6] is a learning paradigm in which the model is trained incrementally, and it can be applied when the arrival time of new data is uncertain.

Incremental algorithms are an active research direction [7] that many experts and scholars have studied in recent years. At present, research on incremental algorithms mainly improves and transforms existing classical classifiers so that they gain incremental capability. For example, the ID4 algorithm [8] is an improved ID3 algorithm: when incremental data arrive, ID4 can adjust the decision tree generated by ID3. In this paper, following the idea of ID4, the decision tree generated by the RLID3 algorithm is adjusted according to the incremental data, and the Incre_RLID3 algorithm with incremental learning capability is obtained.

After years of experiments and applications, ensemble learning has proven to be a machine learning method worth studying [9-10]. Ensemble learning can make up for the shortcoming that a single classifier trained on a data set often fails to fit the actual data well. It trains multiple classifiers on the data set in some way and combines their outputs as the final classification result.

It is found that the Bagging ensemble learning method [11] can use incremental classifiers as its base classifiers. Integrating these incremental classifiers yields an incremental integration algorithm with good accuracy and incremental learning ability.

3. Incremental RLID3 Algorithm Incre_RLID3

According to the data set, the RLID3 algorithm selects the attribute with the maximum decision tree optimization ratio as the split attribute. Then, the data set is divided into subsets according to the values of this attribute, and the decision tree optimization ratio of each subset is computed recursively.

Here, the decision tree optimization ratio is the ratio of the information gain of the current attribute to the number of leaves of the decision tree generated from the current node, as shown in Equation (1).

$DTOR(S, A)=\frac{Gain(S, A)}{LeafNum(S, A)}$

In the above formula, $DTOR(S, A)$ represents the decision tree optimization ratio, $Gain(S, A)$ represents the information gain of attribute A on the data set S, and $LeafNum(S, A)$ represents the number of leaves of the decision tree formed from data set S with split attribute A.
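As a sketch of how Equation (1) might be computed, the following Python fragment derives $Gain(S, A)$ from entropies and divides by a caller-supplied leaf count. The helper names (entropy, information_gain, dtor) are ours, not the paper's, and LeafNum is taken as an input because it depends on the RLID3 tree-building procedure itself.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(samples, labels, attr_index):
    """Gain(S, A): entropy reduction from splitting on the attribute at attr_index."""
    n = len(samples)
    by_value = {}
    for s, y in zip(samples, labels):
        by_value.setdefault(s[attr_index], []).append(y)
    remainder = sum(len(ys) / n * entropy(ys) for ys in by_value.values())
    return entropy(labels) - remainder

def dtor(samples, labels, attr_index, leaf_count):
    """Equation (1): DTOR(S, A) = Gain(S, A) / LeafNum(S, A).

    leaf_count is LeafNum(S, A), the number of leaves of the subtree built
    from S with A as the split attribute, supplied by the caller."""
    return information_gain(samples, labels, attr_index) / leaf_count
```

On the Table 1 data, splitting on Outlook has gain 1.0; with a 3-leaf subtree, its DTOR would be 1/3.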

The concepts used in the algorithm are introduced next. A sample feature code is a string that represents one sample of the data set. The values of each attribute are obtained from the .arff file in Weka and labelled 0, 1, …, (n-1), where n is the number of values of the attribute. Each column of the code represents an attribute, and the order of the attributes is the same as in the .arff file. A sample feature code set is a collection of such sample feature codes.

The data set of Table 1, stored as a sample feature code set, is represented as: 000, 000, 101, and 201.

Table 1.   Examples of data collection

No. | Outlook  | Humidity | Play
1   | sunny    | high     | no
2   | sunny    | high     | no
3   | overcast | high     | yes
4   | rainy    | high     | yes

In the Outlook attribute, the value sunny is coded 0, overcast is coded 1, and rainy is coded 2. The other attributes are coded in the same way.

The samples of the data set are added to the sample feature code set by reading each sample in turn. Examples of data collection are shown in Table 1.
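To make the encoding concrete, here is a small Python sketch that assigns integer codes to attribute values and produces the sample feature codes for Table 1. One assumption to note: this sketch assigns codes in order of first appearance in the data, which happens to coincide with the paper's .arff declaration order for this example.

```python
def build_value_maps(samples):
    """Assign each attribute value an integer code 0, 1, ... per column,
    in order of first appearance (the paper takes the order from the
    .arff attribute declarations instead)."""
    n_cols = len(samples[0])
    maps = [{} for _ in range(n_cols)]
    for row in samples:
        for col, value in enumerate(row):
            maps[col].setdefault(value, len(maps[col]))
    return maps

def encode(samples):
    """Turn each sample into its sample feature code string."""
    maps = build_value_maps(samples)
    return ["".join(str(maps[col][v]) for col, v in enumerate(row))
            for row in samples]

table1 = [
    ("sunny", "high", "no"),
    ("sunny", "high", "no"),
    ("overcast", "high", "yes"),
    ("rainy", "high", "yes"),
]
print(encode(table1))  # ['000', '000', '101', '201']
```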

3.1. Incre_RLID3 Algorithm

Based on the analysis in the last section, the incremental base classifier is the key problem of the algorithm. In this paper, the Incre_RLID3 algorithm is used as the base classifier. Its core idea is to dynamically adjust the decision tree generated by the RLID3 algorithm using the incremental data.

Incre_RLID3 algorithm can be divided into the following two stages.

(1) Initial stage

In the initial stage, the training data set is obtained; no classifier has been trained before this stage begins. The RLID3 algorithm is applied to the training set, and after this stage ends, a trained decision tree model is obtained.

(2) Increment stage

The incremental stage takes place some time after the initial stage has finished, once incremental data are available. Before this stage starts, the classifier trained in the initial stage already exists. In this stage, the Incre_RLID3 algorithm uses the incremental data set to adjust the decision tree model trained in the initial stage. After this stage ends, a decision tree model updated with the incremental data set is obtained.

The initial stage of the Incre_RLID3 algorithm simply uses the RLID3 algorithm; the incremental stage is shown below. Before the incremental data set is used, each of its samples is classified with the decision tree from the initial stage, and any sample that is already classified correctly is removed from the incremental data set. The incremental data set used by Algorithm 1 refers to the set after this filtering.
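The filtering step described above can be sketched as follows. The Node structure and the helper names are hypothetical, since the paper does not specify the tree representation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    attribute: Optional[int] = None   # index of the split attribute; None for a leaf
    label: Optional[str] = None       # class label, meaningful at a leaf
    children: dict = field(default_factory=dict)  # attribute value -> child Node

def classify(node, sample):
    """Walk the decision tree from the given node down to a leaf."""
    while node.attribute is not None:
        node = node.children[sample[node.attribute]]
    return node.label

def filter_increment(root, increment):
    """Keep only the (sample, label) pairs the current tree misclassifies;
    only these are passed to the incremental update of Algorithm 1."""
    return [(s, y) for s, y in increment if classify(root, s) != y]
```

For example, with a tree that splits on Outlook and predicts "no" for sunny, a new sunny sample labelled "yes" is kept, while a rainy sample labelled "yes" is discarded as already correct.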

Algorithm 1   updateRLID3Tree (root node of an RLID3 decision tree, incremental data set)

1. Input: root node of an RLID3 decision tree, incremental data set
2. Output: adjusted decision tree
3. declare a variable curNode that represents the node currently visited during the traversal
4. set curNode to the root node of the input RLID3 decision tree
5. add the incremental data set to the sample feature code set of curNode
6. if (every sample in the sample feature code set of curNode has the same class) {//all samples belong to one class
7. set curNode to a leaf node, namely: set the attribute of curNode to null
8. set the class attribute value of curNode to the class of its sample feature code set
9. }else{//the samples do not all belong to the same class
10. if (curNode is a leaf node) {//the attribute of curNode is null
11. using the sample feature code set of curNode and Equation (1), choose the attribute with the maximum decision tree optimization ratio and set it as the attribute of curNode
12. if (the decision tree optimization ratio of that attribute is 0) {
13. set curNode to a leaf node, namely: set the attribute of curNode to null
14. set the class attribute value of curNode to the majority class of its sample feature code set
15. }else{//the maximum decision tree optimization ratio is not 0
16. taking the attribute of curNode as the root attribute, call the RLID3 algorithm makeRLID3Tree(sample feature code set of curNode, attribute of curNode) to build an RLID3 subtree
17. }
18. }else{//curNode is a branch node
19. using the sample feature code set of curNode and Equation (1), select the attribute with the maximum decision tree optimization ratio, denoted as A
20. if (the decision tree optimization ratio of A is 0) {
21. set curNode to a leaf node
22. set the class attribute value of curNode to the majority class of its sample feature code set
23. }else{//the decision tree optimization ratio of A is not 0
24. if (the attribute of curNode is the same as A) {
25. for (sample : sample feature code set of curNode) {
26. add the sample to childrenTrainData[index of the sample's value on the split attribute]
27. }
28. for (j = 0; j < the number of values of the attribute of curNode; j++) {
29. take the child node that corresponds to the j-th value as the root of a subtree and call updateRLID3Tree(that child node, childrenTrainData[j]) to traverse the subtree
30. }
31. }else{//the attribute of curNode is not the same as A
32. set the attribute of curNode to A and, taking it as the root attribute, call makeRLID3Tree(sample feature code set of curNode, A) to rebuild the subtree
33. }
34. }
35. }
36. }

3.2. Ensemble Learning based on Parallel Weighting

In this paper, the parallel weighted algorithm combines multiple base classifiers and therefore belongs to the family of ensemble learning algorithms. Parallel weighting refers to training multiple base classifiers in parallel while retaining each trained classifier's classification accuracy, which is obtained according to Equation (2). In this section, the RLID3 algorithm is used to generate the decision trees that serve as base classifiers.

Definition 1 (weight recorder). A two-dimensional matrix represents the weight recorder, which captures the relationship between the base classifiers and the classified categories. Rows represent base classifiers, and columns represent the classes of the data set. The number of rows is set to the number of base classifiers m, and the number of columns to the number n of values of the class attribute of the data set. Using A to denote the matrix, aij represents the weight with which the i-th base classifier assigns a sample to category j. The structure of the two-dimensional matrix can be expressed as follows.

$A=\left[ \begin{matrix} {{a}_{11}} & \cdots & {{a}_{1n}} \\ \vdots & \ddots & \vdots \\ {{a}_{m1}} & \cdots & {{a}_{mn}} \\ \end{matrix} \right]$

The weight recorder is used in the PAR_WT algorithm, as shown below.

The ensemble learning algorithm based on parallel weighting is divided into two parts.

(1) Calculate the weight of each base classifier

In this part, the multiple base classifiers are trained in parallel to obtain each classifier's accuracy. The formula for classification accuracy is shown in Equation (2).

$precision=\frac{\text{number of correctly classified samples}}{\text{total number of samples}}$

The base classifier’s accuracy from Equation (2) is substituted into Equation (3).

$weight={{\log }_{2}}(\frac{1}{1-precision})$

In the above Equation (3), weight represents the weight of the base classifier, and precision indicates its classification accuracy. The weight obtained from the accuracy of a base classifier is used as that base classifier's weight.
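Equations (2) and (3) read directly in Python as follows; the cap on precision is our own guard, since a perfectly accurate base classifier would otherwise receive an infinite weight, a corner case the paper does not discuss.

```python
import math

def precision(n_correct, n_total):
    """Equation (2): fraction of samples classified correctly."""
    return n_correct / n_total

def classifier_weight(p, cap=0.999):
    """Equation (3): weight = log2(1 / (1 - precision)).

    The cap is our own guard against division by zero at p == 1;
    the paper does not specify how that case is handled."""
    p = min(p, cap)
    return math.log2(1.0 / (1.0 - p))

print(classifier_weight(0.5))   # 1.0
print(classifier_weight(0.75))  # 2.0
```

The weight grows quickly with accuracy, so nearly perfect base classifiers dominate the vote.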

(2) Using the weights to integrate the classification results of the base classifiers

i. Using the test set, the outputs of the base classifiers and their weights are entered into the weight recorder.

ii. The sum of each column of the two-dimensional matrix is calculated; the class of the column with the maximum sum is used as the classification result.
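Steps i and ii above can be sketched as follows, assuming each base classifier returns a class index; the function and variable names are ours, not the paper's.

```python
def par_wt_classify(classifiers, weights, sample, n_classes):
    """Fill the weight recorder of Definition 1 for one sample and
    return the index of the class whose column sum is largest."""
    m = len(classifiers)
    recorder = [[0.0] * n_classes for _ in range(m)]   # the m x n matrix A
    for i, clf in enumerate(classifiers):
        j = clf(sample)                  # class index predicted by classifier i
        recorder[i][j] += weights[i]     # a_ij accumulates classifier i's weight
    column_sums = [sum(row[j] for row in recorder) for j in range(n_classes)]
    return column_sums.index(max(column_sums))
```

For instance, with three classifiers predicting classes 0, 1, 1 and weights 3, 1, 1, the weighted vote picks class 0, whereas plain majority voting would pick class 1.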

Bagging is a common way of ensemble learning. Bagging trains multiple base classifiers in parallel, but it simply uses majority voting to combine their outputs, and the voting result is used as the prediction of the ensemble. In this paper, the PAR_WT algorithm instead computes a weight for each base classifier with Equation (3), combines the classification results of the base classifiers using these weights, and takes the combined result as the final classification.

The PAR_WT algorithm has two stages. The first stage trains the base classifiers and computes their weights. The second stage combines all the base classifiers to predict the result. Because the main subject of this section is ensemble learning, incremental learning is not added here; the non-incremental RLID3 algorithm is used to train the base classifiers.

The steps of the PAR_WT algorithm are shown below.

Algorithm 3-1   PAR_WT algorithm

Input: data set, the number of base classifiers
Output: base classifiers and their weights
1. Perform the following procedure T times in parallel {//T represents the number of base classifiers
2. Randomly divide the data set into 10 parts.
3. Take 9 parts as the training set of the i-th base classifier.
4. Take the remaining 1 part as the test set of the i-th base classifier.
5. Use the RLID3 algorithm to train the i-th base classifier.
6. Using Equation (2), calculate the classification accuracy of the i-th base classifier.
7. Using Equation (3), calculate the weight of the i-th base classifier.
8. }

The second step of the PAR_WT algorithm is shown as follows.

Algorithm 3-2   Parallel And Weight Second (pre-classification sample data)

Input: sample data to be classified, trained base classifiers and their weights
Output: classification result of the sample
1. Initialize the weight recorder, making aij = 0
2. for (classifier : classifier set) {//classifier represents the i-th classifier
3. j = classifier.classify(instance) //j represents the classifier's classification result for the sample
4. aij += classifier.getWeight() //add the weight of the i-th classifier to aij
5. }
6. Calculate the sum of each column of the weight recorder
7. The class of the column with the maximum sum is used as the classification result of the pre-classified data.

4. Incremental Integration Algorithm based on Incre_RLID3 Algorithm

By combining the Incre_RLID3 algorithm and the PAR_WT algorithm, an incremental integration algorithm based on RLID3 is proposed. The main task of this section is to use the Incre_RLID3 algorithm to train the base classifiers in the PAR_WT algorithm. Because PAR_WT is divided into two stages, the incremental integration algorithm is likewise divided into an initial stage and an incremental stage.

(1) Initial stage

The initial stage uses the PAR_WT algorithm, with the RLID3 algorithm training the base classifiers. It is divided into two steps, the first and the second, which are the same as the two steps of the PAR_WT algorithm.

Algorithm 2   Weight recorder

Input: data set, trained classifier set
Output: weight recorder
1. Initialize the weight recorder, making aij = 0
2. for (instance : data set) {
3. for (classifier : classifier set) {//classifier represents the i-th classifier
4. j = classifier.classify(instance) //j represents the classifier's classification result for the sample
5. aij += classifier.getWeight() //add the weight of the i-th classifier to aij
6. }
7. }

(2) Incremental stage

The incremental stage is likewise divided into two steps, the first and the second. The incremental integration algorithm is based on the incremental stage of the PAR_WT algorithm. The second step is the same as in PAR_WT, but the first step differs slightly: it requires the base classifiers from the initial stage as input parameters, which the PAR_WT algorithm does not need.

The algorithm steps of the incremental stage of Incre_RLID3_ENM algorithm are shown in Algorithm 4.

Algorithm 4   RLID3 Incre Ensemble (incremental data set, initial stage of all base classifiers)

Input: incremental data set, initial stage of all base classifiers
Output: adjusted base classifiers
1. Perform the following procedure T times in parallel {//T represents the number of base classifiers
2. Call Algorithm 1, using the incremental data set to adjust the i-th base classifier and obtain the adjusted i-th base classifier
3. The test set of the i-th base classifier from the initial stage is used as the test set of the adjusted i-th base classifier
4. Using Equation (2), the accuracy of the i-th base classifier is calculated.
5. Using Equation (3), the weight of the i-th base classifier is calculated.
6. }

The steps of the first stage are shown in Algorithm 4. The steps of the second stage are the same as the second stage of the PAR_WT algorithm.

5. Experiment Results and Analysis

The classification problem is studied in this paper. The experiments were carried out using 10 classification data sets from the UCI repository. The data sets cover a wide range of areas: life, computer, social, and games. The basic information of the data sets is shown in Table 2.

Table 2.   Basic information of the data sets

Name        | Instances | Attributes | Classifications | Missing Values
Letter      | 20000     | 16         | 26              | No
HIV         | 6590      | 8          | 2               | No
Nursery     | 12960     | 8          | 5               | No
Tic-Tac-Toe | 958       | 9          | 2               | No
Connect-4   | 67557     | 42         | 3               | No
Chess       | 3196      | 36         | 2               | No
Splice      | 3190      | 61         | 3               | No
Mushroom    | 8124      | 22         | 2               | Yes
Lymph       | 148       | 18         | 4               | No
Breast-w    | 699       | 10         | 2               | Yes

Because the algorithm can only deal with samples without missing values, Mushroom and Breast-w are pre-processed. There are two methods to pre-process a data set with missing values: the first is to remove the attribute with missing values from the data set, and the second is to remove the samples with missing data. In the Mushroom data set, only the stalk-root attribute has missing values, and the proportion of samples missing this value is large, so Mushroom is used after removing the stalk-root attribute. After processing, the data sets contain no samples with missing values, and all attributes are categorical with enumerable values.

5.1. Evaluation Method

In this paper, precision is used as the evaluation index of the algorithms; the formula is shown in Equation (4).

$precision=\frac{\text{number of correctly classified samples}}{\text{total number of samples}}$

5.2. Results and Analysis

The purpose of this section is to record the classification accuracy of the Incre_RLID3 algorithm. The data set is divided into 4 parts: the initial training set accounts for 40% of the total sample size, the test set for 10%, the first increment for 30%, and the second increment for 20%.

The experimental procedure is as follows.

(1) The initial training set is used as a parameter to call the initial stage of the Incre_RLID3 algorithm and train the classifier. The accuracy of the classifier is calculated using the test set.

(2) Incremental training is performed by using the first incremental data set and the classifier from the initial stage as parameters of the Incre_RLID3 algorithm. The accuracy of the classifier is calculated using the test set.

(3) Incremental training is performed by using the second incremental data set and the classifier obtained in the last stage as parameters of the Incre_RLID3 algorithm. The accuracy of the classifier is calculated using the test set.

In order to visually show the experimental results of Incre_RLID3 algorithm, the experiment results are shown in Figure 1.

Figure 1.   Precision diagram of Incre_RLID3 algorithm


The results show that the classification accuracy of a later stage of the Incre_RLID3 algorithm is not necessarily higher than that of the previous stage. The accuracy at each stage is also not as high as that obtained by training and testing the RLID3 algorithm directly on the entire data set. The advantage of the Incre_RLID3 algorithm is that it is able to train on data incrementally.

The parallel weighted experiment needs to specify the kind of classification algorithm and the number of base classifiers. In the experiment, 3 base classifiers based on the RLID3 algorithm are used.

Ensemble learning needs to consider how to split the data set efficiently and reasonably. The way in which the data set is processed is as follows.

01: The data set is divided into data set 1 and data set 2 according to the ratio of 9:1.

02: Perform the following procedure 3 times in parallel {

03: 9/10 of the samples are extracted randomly from data set 1; these samples are used as the training set of base classifier i

04: The remaining 1/10 of the samples of data set 1 are taken as the test set of base classifier i

05: }

06: The data set 2 is used as the test set of the integrated classifier.

Using the above method to process the data sets ensures that the test set has no intersection with the training sets, while the base classifiers can still be trained with more data. Since the training sets are randomly extracted, the training sets of the 3 base classifiers are generally not identical.
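The splitting procedure in steps 01-06 above can be sketched in Python as follows; the function name, seed handling, and shuffle-based sampling are our own choices.

```python
import random

def split_for_par_wt(dataset, n_classifiers=3, seed=0):
    """Hold out 1/10 of the data as the ensemble test set (data set 2),
    then for each base classifier draw a random 9/10 of the remainder
    (data set 1) as its training set and keep the rest as its own test set."""
    rng = random.Random(seed)
    data = dataset[:]
    rng.shuffle(data)
    cut = len(data) * 9 // 10
    dataset1, dataset2 = data[:cut], data[cut:]
    splits = []
    for _ in range(n_classifiers):
        pool = dataset1[:]
        rng.shuffle(pool)
        inner = len(pool) * 9 // 10
        splits.append((pool[:inner], pool[inner:]))  # (train_i, test_i)
    return splits, dataset2
```

By construction, every base classifier's training and test sets come from data set 1, so none of them intersects the ensemble test set.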

Figure 2 shows the accuracy of the integrated classifiers.

Figure 2.   Precision diagram of PAR_WT algorithm


From the experimental results, it can be seen that after the parallel weighted algorithm is applied, the accuracy of the integrated RLID3 classifier is higher than or equal to the accuracy of the base classifiers. This shows that the prediction of the ensemble classifier is better than that of a single classifier.

The incremental integration experiment combines incremental and ensemble learning. In the experiment, the data set is divided into 4 disjoint subsets: the initial data set, the incremental data set for the first increment experiment, the incremental data set for the second increment experiment, and the test set. In this experiment, 3 base classifiers are used, and the Incre_RLID3 algorithm is used to train them.

The experimental procedure using the data set to carry out this section is shown below.

(1) Samples are randomly selected from the data set: 40% for the initial data set, 30% for the first incremental data set, 20% for the second incremental data set, and 10% for the test set used at each stage of the algorithm.

(2) The initial data set is processed 3 times in parallel; each time, 9/10 of its samples are randomly drawn as the training set of base classifier i, and the remaining 1/10 is used as the test set of base classifier i.

(3) The 3 test sets obtained in step (2) are used as the test sets of the first and second increment experiments. According to Equation (3), the weights of the 3 base classifiers are calculated.

Figure 3 shows the classification results of the ensemble classifier in the initial stage, the first increment stage and the second increment stage.

Figure 3.   Precision of each stage of Incre_RLID3_ENM


From the experimental results, it can be seen that the Incre_RLID3_ENM algorithm achieves good precision while being able to process incremental data.

6. Conclusion

At present, research on incremental integration algorithms is very active, focusing mainly on their improvement and application. The Incre_RLID3 algorithm proposed in this paper is based on RLID3, which itself has no incremental capability. Although the bagging method can train a number of models in parallel, it simply uses majority voting to synthesize the results of the models, and the voting result is used as the final prediction. The PAR_WT algorithm instead uses Equation (3) to obtain a weight for each model and combines the models' predictions with these weights to produce the final prediction. Incre_RLID3 and PAR_WT are combined to obtain the Incre_RLID3_ENM algorithm. Further research can consider how to improve the construction speed of the incremental learning algorithm and the form of parallel construction, as well as the possibility of further reducing the storage space of incremental integration algorithms.

Acknowledgements

This work was funded by the National Natural Science Foundation of China under Grant (No. 61772152 and No. 61502037), and the Basic Research Project (No. JCKY2016206B001, JCKY2014206C002 and JCKY2017604C010).

Reference

C. L. Cassano, A. J. Simon, L. Wei, C. Fredrickson, Z. H. Fan ,

“Use of Vacuum Bagging for Fabricating Thermoplastic Microfluidic Devices, ”

Lab on a Chip, Vol. 15, No. 1, pp. 62, 2015

DOI:10.1039/c4lc00927d      URL     PMID:4256099      [Cited within: 1]

In this work we present a novel thermal bonding method for thermoplastic microfluidic devices. This simple method employs a modified vacuum bagging technique, a concept borrowed from the aerospace industry, to produce conventional thick substrate microfluidic devices, as well as multi-layer film devices. The bonds produced using this method are superior to those obtained using conventional thermal bonding methods, including thermal lamination, and are capable of sustaining burst pressures in excess of 550 kPa. To illustrate the utility of this method, thick substrate devices were produced, as well as a six-layer film device that incorporated several complex features.

Y. Chen, J. Cao, S. Feng, Y. Tan ,

“An Ensemble Learning based Approach for Building Airfare Forecast Service, ”

inProceedings of IEEE International Conference on Big Data, pp. 964-969, 2015

DOI:10.1109/BigData.2015.7363846      URL     [Cited within: 1]

Modern airlines use sophisticated pricing models to maximize their revenue, which results in highly volatile airfares. Without sufficient information, it is usually difficult for ordinary customers to estimate future price changes. Over the last few years, several studies have tried to solve the problem of optimal purchase timing for flight tickets, in which the prediction task is described as a binary classification concerning to buy or wait at a given point. However, forecasting the real-time price changes has never received much attention from the research community. In this paper, we address the problem of airfare forecast and present a systematic approach that covers the most important aspects of building a forecast service, including data modelling, forecast algorithm and long-term prediction strategies. A novel matrix-like data schema is first introduced to organize price series and extract temporal features. For the prediction task, we specifically investigate Learn++.NSE, an incremental ensemble classifier designed for learning in nonstationary environments. We propose a modification of the original algorithm to make a regressor that is capable of learning incrementally from streaming price series, with an extra ability of multi-step ahead forecasting. We further evaluate the forecast model on real-world price data collected from diverse routes and discuss its performance with respect to short-term and long-term prediction.

B. Gu, V. S. Sheng, Z. Wang, D. Ho, S. Osman, S. Li ,

“Incremental Learning for ν-Support Vector Regression, ”

Neural Networks, Vol. 67, No. C, pp. 140, 2015

DOI:10.1016/j.neunet.2015.03.013      URL     PMID:25933108      [Cited within: 1]

The ν-Support Vector Regression (ν-SVR) is an effective regression algorithm, which has the advantage of using a parameter ν on controlling the number of support vectors and adjusting the width of the tube automatically. However, compared to ν-Support Vector Classification (ν-) (Sch02lkopf et al., 2000), ν-SVR introduces an additional linear term into its objective function. Thus, directly applying the accurate on-line ν-algorithm (AONSVM) to ν-SVR will not generate an effective initial solution. It is the main challenge to design an incremental ν-SVR algorithm. To overcome this challenge, we propose a special procedure called initial adjustments in this paper. This procedure adjusts the weights of ν-based on the Karush-Kuhn-Tucker (KKT) conditions to prepare an initial solution for the incremental . Combining the initial adjustments with the two steps of AONSVM produces an exact and effective incremental ν-SVR algorithm (INSVR). Theoretical analysis has proven the existence of the three key inverse matrices, which are the cornerstones of the three steps of INSVR (including the initial adjustments), respectively. The experiments on benchmark datasets demonstrate that INSVR can avoid the infeasible updating paths as far as possible, and successfully converges to the optimal solution. The results also show that INSVR is faster than batch ν-SVR algorithms with both cold and warm starts.

Y. Guo, L. Jiao, S. Wang, S. Wang, F. Liu, K. Rong, et al., “A Novel Dynamic Rough Subspace based Selective Ensemble,” Pattern Recognition, Vol. 48, No. 5, pp. 1638-1652, 2015. DOI: 10.1016/j.patcog.2014.11.001

B. Han, B. He, N. Rui, M. Ma, S. Zhang, M. Li, et al., “LARSEN-ELM: Selective Ensemble of Extreme Learning Machines using LARS for Blended Data,” Neurocomputing, Vol. 149, pp. 285-294, 2015. DOI: 10.1016/j.neucom.2014.01.069

L. Hu, C. Shao, J. Li, and H. Ji, “Incremental Learning from News Events,” Knowledge-based Systems, Vol. 89, pp. 618-626, 2015. DOI: 10.1016/j.knosys.2015.09.007

N. Li, Y. Jiang, and Z. H. Zhou, “Multi-Label Selective Ensemble,” in Proceedings of International Workshop on Multiple Classifier Systems, pp. 76-88, 2015

D. Peressutti, W. Bai, T. Jackson, M. Sohal, A. Rinaldi, D. Rueckert, et al., “Prospective Identification of CRT Super Responders using a Motion Atlas and Random Projection Ensemble Learning,” in Proceedings of International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 493-500, 2015. DOI: 10.1007/978-3-319-24574-4_59

J. C. Schlimmer and D. H. Fisher, “A Case Study of Incremental Concept Induction,” in Proceedings of National Conference on Artificial Intelligence, pp. 496-501, Philadelphia, PA, August 11-15, 1986

X. Z. Wang, H. J. Xing, Y. Li, Q. Hua, C. R. Dong, and W. Pedrycz, “A Study on Relationship between Generalization Abilities and Fuzziness of Base Classifiers in Ensemble Learning,” IEEE Transactions on Fuzzy Systems, Vol. 23, No. 5, pp. 1638-1654, 2015. DOI: 10.1109/TFUZZ.2014.2371479

M. S. Zia and M. A. Jaffar, “An Adaptive Training based on Classification System for Patterns in Facial Expressions using SURF Descriptor Templates,” Kluwer Academic Publishers, 2015. DOI: 10.1007/s11042-013-1803-3
