Article Information

Title: DATABASE PREPROCESSING AND COMPARISON BETWEEN DATA MINING METHODS
Author: Yas A. Alsultanny
Journal: International Journal of New Computer Architectures and their Applications
Print ISSN: 2220-9085
Year of publication: 2011
Volume: 1
Issue: 1
Pages: 61-73
Publisher: Society of Digital Information and Wireless Communications
Abstract: Database preprocessing is important for efficient memory usage. Compression is one of the preprocessing steps needed to reduce the memory required to store and load data for processing. The compression method introduced in this paper was tested on proposed examples to show the effect of repetition in the database, as well as of database size; the results showed that the compression ratio increases as repetition increases. Compression is one of the important data preprocessing activities to carry out before data mining. Three data mining methods, Naïve Bayes, Nearest Neighbor, and Decision Tree, were tested. The implementation of the three methods showed that Naïve Bayes is effective when the data attributes are categorical, and that it can be used successfully in machine learning. Nearest Neighbor is most suitable when the data attributes are continuous or categorical. The third method tested, the Decision Tree, is a simple predictive method that classifies data using simple rules. The success of a data mining implementation depends on the completeness of the database, represented by the data warehouse, which must be organized according to the important characteristics of a data warehouse.
Keywords: Data mining; Preprocessing; Nearest Neighbour; Naïve Bayes; Decision tree