Basic Article Information

Title: Integrating K-Means Algorithm with Horizontal Aggregation to Prepare Datasets
Authors: Brahmini Saraswathi; K. T. V. Subbarao; M. M. Balakrishna et al.
Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Online ISSN: 0976-8491
Publication year: 2012
Volume: 3
Issue: 2
Pages: 1086-1090
Language: English
Publisher: Ayushmaan Technologies
Abstract: Preparing datasets for data mining is a very difficult task. Conventional RDBMSs usually manage tables in vertical form, whereas data mining systems widely use datasets with columns in a horizontal tabular layout in order to analyze data efficiently. Preparing a data set is one of the more complex tasks in a data mining project and requires many SQL queries, table joins, and column aggregations. Aggregated columns in a horizontal tabular layout return a set of numbers per row, instead of one number per row. The system uses one parent table and several child tables, and operations are then performed on the data loaded from these multiple tables. The PIVOT operator offered by the RDBMS is used to compute the aggregations. The PIVOT method is much faster and much more scalable. Partitioning the large data set obtained from horizontal aggregation into homogeneous clusters is an important task in this system. The K-means algorithm implemented in SQL is best suited to this operation.
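
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `sales` table, its columns, and the function names `horizontal_aggregate` and `kmeans` are all hypothetical. SQLite has no PIVOT operator, so the sketch emulates it with one `SUM(CASE ...)` column per distinct value, and K-means runs in plain Python rather than in SQL as the paper proposes.

```python
import sqlite3


def horizontal_aggregate(rows):
    """Pivot vertical (store_id, product, amount) rows into one row per store.

    Hypothetical schema; SUM(CASE ...) stands in for the PIVOT operator.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store_id INTEGER, product TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    products = [r[0] for r in conn.execute(
        "SELECT DISTINCT product FROM sales ORDER BY product")]
    # One aggregated column per distinct product value.
    cols = ", ".join(
        "SUM(CASE WHEN product = ? THEN amount ELSE 0.0 END)" for _ in products)
    sql = f"SELECT store_id, {cols} FROM sales GROUP BY store_id ORDER BY store_id"
    table = conn.execute(sql, products).fetchall()
    conn.close()
    return products, table


def kmeans(points, k, iterations=20):
    """Plain K-means (Lloyd's algorithm) with squared Euclidean distance.

    Initial centroids are the first k points, an assumption made only to
    keep the sketch deterministic.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Made-up sample data: two low-volume stores and two high-volume stores.
SAMPLE_ROWS = [
    (1, "A", 4.0), (1, "A", 6.0), (1, "B", 12.0),
    (2, "A", 11.0), (2, "B", 9.0),
    (3, "A", 100.0), (3, "B", 95.0),
    (4, "A", 98.0), (4, "B", 105.0),
]

if __name__ == "__main__":
    products, table = horizontal_aggregate(SAMPLE_ROWS)
    labels, _ = kmeans([row[1:] for row in table], k=2)
    print(products)
    for row, label in zip(table, labels):
        print(row, "cluster", label)
```

Each output row of `horizontal_aggregate` holds one store's totals across all products, which is the horizontal layout the abstract describes. Those rows are then passed directly to `kmeans` as feature vectors.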