Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Print ISSN: 2278-1323
Year: 2012
Volume: 1
Issue: 9
Pages: 069-076
Publisher: Shri Pannalal Research Institute of Technology
Abstract: This paper presents external sorting combined with data preprocessing, a data mining technique adapted here for general use. The large data sets held by an organization typically contain redundancy, noise and inconsistency. To eliminate these, data preprocessing is performed on the raw data before the sorting technique is applied. Data preprocessing includes methods such as data cleaning, data integration, data transformation and data reduction. Depending on the complexity of the given data, the appropriate methods are selected and applied to the raw data to improve its quality. External sorting is then applied. With preprocessing, external sorting takes fewer passes than the $\log_B(N/M) + 1$ passes of basic B-way external merge sort, incurs an input/output cost below the basic algorithm's $2N(\log_B(N/M) + 1)$, and produces fewer runs than basic external sorting.
Keywords: data preprocessing; external sorting; data cleaning; passes; inputs/outputs; runs
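
The cost formulas quoted in the abstract can be illustrated with a minimal sketch. It is not taken from the paper. It assumes that N is the file size in pages, M is the number of pages sorted in memory per initial run, and B is the merge fan-in, and it rounds up to whole passes (the abstract writes the logarithm without a ceiling). The function name `external_sort_cost` and the page counts 1000 and 700 are hypothetical, chosen only to show how a smaller input after preprocessing lowers runs, passes and I/O.

```python
import math


def external_sort_cost(n_pages, mem_pages, fan_in):
    """Return (initial runs, total passes, page I/Os) for B-way external merge sort.

    Assumed meaning of the symbols (not stated in the abstract):
    n_pages = N, mem_pages = M, fan_in = B.
    """
    # Pass 0: read M pages at a time, sort in memory, write one run each.
    runs = math.ceil(n_pages / mem_pages)

    # Merge passes: each pass merges up to B runs into one.
    # An integer loop avoids floating-point error in the logarithm.
    merge_passes = 0
    remaining = runs
    while remaining > 1:
        remaining = math.ceil(remaining / fan_in)
        merge_passes += 1

    passes = 1 + merge_passes
    # Every pass reads and writes each page once: 2 * N * passes.
    io_cost = 2 * n_pages * passes
    return runs, passes, io_cost


# Hypothetical raw input: 1000 pages, 10-page runs, 9-way merge.
print(external_sort_cost(1000, 10, 9))
# Hypothetical input after preprocessing removed redundant records: 700 pages.
print(external_sort_cost(700, 10, 9))
```

Under these assumed numbers the raw input needs 100 runs and 4 passes, while the reduced input needs 70 runs and 3 passes, which matches the direction of the improvement the abstract claims.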