Abstract: Microarray technology allows scientists to analyze thousands of gene expression profiles simultaneously. However, microarray gene expression data often contain multiple missing expression values, which arise for a variety of reasons. Effective methods for imputing these missing values are needed because many gene analysis algorithms require a complete matrix of gene array values. Several algorithms have been proposed to handle this problem, but they have various limitations. In this paper, we develop a novel method for imputing missing values in microarray time-series data that combines k-nearest neighbor (KNN) and dynamic time warping (DTW). We also analyze and implement several variants of DTW to further improve the efficiency and accuracy of our method. Experimental results on real microarray time-series datasets show that our method is more accurate than existing missing value imputation methods.
Keywords: microarray time-series data; missing value imputation; dynamic time warping; k-nearest neighbor
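
The combination of KNN and DTW named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dtw_distance` and `knn_dtw_impute`, the absolute-difference cost, the restriction of candidate neighbors to complete rows, and the inverse-distance weighting are all assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference between points.
    """
    n, m = len(a), len(b)
    # D[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def knn_dtw_impute(matrix, k=2):
    """Fill None entries in each row (gene) from its k DTW-nearest complete rows.

    Assumptions: only rows without missing values serve as neighbors, and
    neighbor values are combined with inverse-distance weights.
    """
    complete = [row for row in matrix if None not in row]
    if not complete:
        raise ValueError("at least one complete row is required")
    result = []
    for row in matrix:
        if None not in row:
            result.append(list(row))
            continue
        # Compare genes only on the time points observed in the target row.
        observed = [j for j, v in enumerate(row) if v is not None]
        target = [row[j] for j in observed]
        scored = sorted(
            (dtw_distance(target, [c[j] for j in observed]), idx)
            for idx, c in enumerate(complete)
        )[:k]
        # Small constant avoids division by zero for identical profiles.
        weights = [1.0 / (d + 1e-9) for d, _ in scored]
        filled = list(row)
        for j, v in enumerate(row):
            if v is None:
                weighted = sum(w * complete[idx][j]
                               for w, (_, idx) in zip(weights, scored))
                filled[j] = weighted / sum(weights)
        result.append(filled)
    return result
```

In this sketch, each row is one gene's expression profile over time and `None` marks a missing value. Using DTW rather than Euclidean distance for neighbor selection lets profiles that are similar in shape but shifted in time still count as close.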