Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2012
Volume: 3
Issue: 5
Pages: 5149-5156
Publisher: TechScience Publications
Abstract: Data mining is a powerful and relatively new method of analyzing data and discovering patterns in large data sets. Its objective is to extract knowledge from a data set in an understandable format. Data mining is the process of collecting, extracting, and analyzing large data sets from different perspectives. Enormous amounts of data are stored in databases and data warehouses as a result of technological advances in computing and the Internet. It is therefore necessary to develop powerful tools for analyzing such huge volumes of data and mining valuable information from them. One of the main challenges in database mining is developing fast and efficient algorithms that can handle large volumes of data, since most mining algorithms perform computation over entire databases, which are often very large. Data mining is a convenient way of extracting patterns that represent knowledge implicitly stored in large data sets, and it focuses on issues relating to their feasibility, usefulness, effectiveness, and scalability. It can be viewed as an essential step in the process of knowledge discovery. Data are normally preprocessed through data cleaning, data integration, data selection, and data transformation, and are thereby prepared for the mining task. Data mining can be performed on various types of databases and information repositories, but the kinds of patterns to be found are specified by data mining functionalities such as class description, association, correlation analysis, classification, prediction, and cluster analysis. This paper gives an overview of the existing data mining algorithms for these functionalities.
Keywords: Data mining; KDD; Clustering; Association Rule; Classification; Sequential and parallel Algorithms