Journal: International Journal of Hybrid Information Technology
Print ISSN: 1738-9968
Year: 2015
Volume: 8
Issue: 7
Pages: 277-292
DOI: 10.14257/ijhit.2015.8.7.26
Publisher: SERSC
Abstract: DNA microarray technology can measure the activity of tens of thousands of genes in cells and has been widely used in clinical diagnosis. However, microarray data are high-dimensional with small sample sizes, and the many irrelevant and redundant genes degrade the performance of classification algorithms. Mutual information is an effective measure that has been widely used in feature gene selection, but it cannot deal directly with continuous features. This paper therefore proposes a novel feature gene selection method to resolve this problem. First, a large number of irrelevant genes are eliminated from the original data using the ReliefF algorithm, yielding a candidate gene subset. Second, an algorithm based on neighborhood mutual information and a forward greedy search strategy, which deals directly with continuous features, is proposed to select feature genes from this candidate subset. Because the neighborhood radius greatly affects reduction performance, a differential evolution algorithm is applied to optimize the radius before reduction. Simulation results on six benchmark microarray datasets show that the method obtains higher classification accuracy while using as few genes as possible; in particular, neighborhood mutual information can handle continuous features directly. The selected feature genes are important for understanding microarray data and for finding pathogenic genes of cancer. The proposed method is effective and efficient for feature gene selection.
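
The second stage described in the abstract (neighborhood mutual information combined with forward greedy search) can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code assumes a commonly used definition: the neighborhood of a sample is every sample within radius `delta` under the Chebyshev distance on the chosen genes, and the neighborhood mutual information with the class label is -(1/n) Σ log2(|N_S(x_i)| · |N_C(x_i)| / (n · |N_{S,C}(x_i)|)). The function names `nmi` and `forward_select`, the `min_gain` stopping threshold, and the fixed radius are illustrative assumptions; the ReliefF pre-filter and the differential evolution search over the radius are omitted.

```python
import math

def nmi(X, y, features, delta):
    """Neighborhood mutual information between a gene subset and class labels.

    Assumed definition (not stated in the abstract): neighbors of x_i are the
    samples within `delta` of x_i on every gene in `features`.
    """
    n = len(X)
    total = 0.0
    for i in range(n):
        # Samples inside the delta-neighborhood of x_i (x_i itself included).
        near = [all(abs(X[i][f] - X[j][f]) <= delta for f in features)
                for j in range(n)]
        same = [y[j] == y[i] for j in range(n)]
        n_feat = sum(near)
        n_cls = sum(same)
        n_joint = sum(1 for j in range(n) if near[j] and same[j])
        total += math.log2(n_feat * n_cls / (n * n_joint))
    return -total / n

def forward_select(X, y, delta, max_genes, min_gain=1e-6):
    """Greedy forward search: add the gene that raises NMI most, stop on no gain."""
    selected, best = [], 0.0
    candidates = list(range(len(X[0])))
    while candidates and len(selected) < max_genes:
        top_gene, top_score = None, -math.inf
        for g in candidates:
            score = nmi(X, y, selected + [g], delta)
            if score > top_score:
                top_gene, top_score = g, score
        if top_score - best <= min_gain:
            break  # no remaining gene adds information about the class
        selected.append(top_gene)
        candidates.remove(top_gene)
        best = top_score
    return selected

# Toy data (made up for illustration): gene 0 separates the classes, gene 1 does not.
X = [[0.10, 0.50], [0.15, 0.90], [0.20, 0.10],
     [0.80, 0.52], [0.85, 0.88], [0.90, 0.12]]
y = [0, 0, 0, 1, 1, 1]
print(forward_select(X, y, delta=0.2, max_genes=2))
```

In this sketch the radius is a plain argument. In the workflow the abstract describes, the radius would instead be tuned by differential evolution before reduction, and `X` would hold only the genes that survive the ReliefF filter.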