Abstract: A central problem in data quality is the presence of missing data. When missing values are abundant, handling them effectively can improve the performance of data mining. One way to treat missing data is imputation: imputation methods replace each missing value with a value estimated from the available data. Missing data imputation remains a current and challenging issue in data mining, because missing values in a dataset can introduce bias that degrades the quality of the learned patterns. To address this issue, this paper proposes several imputation methods that fill in missing values while introducing negligible bias. We evaluate our approach experimentally and demonstrate that it is much more efficient than the other available imputation methods.
Keywords: KDD (Knowledge Discovery in Databases); Data mining; Attribute missing values; Imputation methods; Sampling
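
As a point of reference for what imputation means in its simplest form, the following minimal sketch fills the missing entries of a single numeric attribute with the mean of its observed entries. It is a generic baseline and not one of the methods proposed in this paper; the function name `impute_mean`, the use of `None` to mark a missing value, and the sample data are assumptions made for illustration only.

```python
from statistics import mean


def impute_mean(column):
    """Return a copy of `column` with None entries replaced by the observed mean.

    Illustrative baseline only: `None` is assumed to mark a missing value.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        # With no observed values there is nothing to estimate from.
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in column]


if __name__ == "__main__":
    ages = [20, None, 30, 40, None]  # hypothetical attribute with two gaps
    print(impute_mean(ages))
```

Mean imputation of this kind preserves the attribute's mean but shrinks its variance, which is one way an imputation method can bias the patterns learned afterwards.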