Int J Performability Eng, 2017, Vol. 13, Issue 4: 383-389. doi: 10.23940/ijpe.17.04.p5.383389

• Original articles •

Clustering-Based Feature Selection Framework for Microarray Data

Smita Chormunge (a) and Sudarson Jena (b)

(a) Research Scholar, Department of Computer Science and Engineering, GITAM University, Hyderabad, India
(b) Department of Information Technology, GITAM University, Hyderabad, India

Abstract:

Gene expression data contains hundreds to thousands of features. It is challenging for machine learning algorithms to find the relevant information in such large and correlated data. Irrelevant and redundant features are computationally costly and decrease the accuracy of machine learning algorithms. Feature selection plays an important role in addressing the problem of high dimensionality, but most traditional feature selection algorithms fail to scale to high-dimensional problems. In this paper, a Clustering-based Feature Selection Framework (CFSF) is proposed. CFSF produces an optimal feature subset by eliminating irrelevant features using a clustering algorithm and redundant features by applying a filter measure to each cluster. Extensive experiments on microarray datasets compare the proposed framework with other representative methods using two classifiers, Naive Bayes and Instance Based. The empirical study demonstrates that the proposed framework is efficient and effective at producing an optimal feature subset and that it improves classifier performance.
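The general idea of combining feature clustering with a per-cluster filter can be illustrated with a small sketch. The abstract does not name the clustering algorithm or the filter measure used in CFSF, so everything below is an assumption chosen for illustration: plain k-means over feature columns, absolute Pearson correlation with the class label as the filter score, and the hypothetical function names `pearson`, `kmeans_features`, and `select_features`. It is not the authors' implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def kmeans_features(columns, k, iters=20):
    """Group feature columns with plain k-means; returns one cluster id per feature.

    Assumption: centroids start at the first k feature columns so the run is
    deterministic. A real pipeline would likely use a better initialisation.
    """
    centroids = [list(columns[i]) for i in range(k)]
    labels = [0] * len(columns)
    for _ in range(iters):
        # Assignment step: each feature joins its nearest centroid.
        for j, col in enumerate(columns):
            dists = [sum((a - b) ** 2 for a, b in zip(col, c)) for c in centroids]
            labels[j] = dists.index(min(dists))
        # Update step: each centroid becomes the mean of its member features.
        for c in range(k):
            members = [columns[j] for j in range(len(columns)) if labels[j] == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels

def select_features(X, y, k):
    """Keep, from each feature cluster, the feature that scores highest on the filter."""
    columns = [list(col) for col in zip(*X)]           # one list per feature
    labels = kmeans_features(columns, k)
    scores = [abs(pearson(col, y)) for col in columns]  # assumed filter measure
    selected = []
    for c in range(k):
        members = [j for j in range(len(columns)) if labels[j] == c]
        if members:
            selected.append(max(members, key=lambda j: scores[j]))
    return sorted(selected)

# Toy data: 6 samples, 4 features. Features 0 and 2 are near-duplicates that
# track the class label; features 1 and 3 are near-duplicates that do not.
X = [
    [1.0, 5.0, 1.1, 5.1],
    [1.2, 2.0, 1.1, 2.1],
    [0.9, 4.0, 1.0, 3.9],
    [3.0, 5.1, 3.1, 5.0],
    [3.1, 2.1, 3.0, 2.0],
    [2.9, 3.9, 3.0, 4.0],
]
y = [0, 0, 0, 1, 1, 1]

selected = select_features(X, y, k=2)
print(selected)  # one feature index from each group of near-duplicates
```

In this sketch, redundancy is handled by keeping a single representative per cluster. Discarding whole clusters whose best filter score is low would be one possible way to also drop irrelevant features, but how CFSF itself treats that step is not specified in the abstract.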


Submitted on December 4, 2016; Revised on May 7, 2017; Accepted on June 18, 2017
References: 25