Article Information

  • Title: Frequent Item Set Mining Approach Using Mapreduce in Hadoop Environment: A Survey
  • Authors: Sumit Gurav ; Dhanaji Salavi ; Sushil Gaikwad
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Electronic ISSN: 2320-9801
  • Publication year: 2017
  • Volume: 5
  • Issue: 1
  • Page: 891
  • DOI: 10.15680/IJIRCCE.2017.0501187
  • Publisher: S&S Publications
  • Abstract: Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies and researchers focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. Frequent itemset mining is one of the classical data mining problems and underlies most data mining applications. It requires very large computation and I/O capacity, and the limited memory and CPU of a single processor degrade the performance of the algorithm. In this paper we propose a distributed algorithm that runs on Hadoop, one of the most popular recent distributed frameworks, which is built mainly around the MapReduce paradigm. The proposed approach takes into account inherent characteristics of the Apriori algorithm related to frequent itemset generation and uses block-based partitioning for dynamic workload management. The algorithm greatly enhances performance and achieves high scalability compared with existing distributed Apriori-based approaches. The proposed algorithm is implemented and tested on large-scale datasets distributed over a cluster.
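
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it names: Apriori support counting expressed as a map step over blocks of transactions and a reduce step that sums the partial counts. It is plain single-process Python, not Hadoop code, and it uses a fixed list of blocks rather than the dynamic workload management the paper describes. All function names (`map_block`, `reduce_counts`, `next_candidates`, `apriori_blocks`) are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def map_block(block, candidates):
    """Map step (illustrative): count each candidate itemset within one block."""
    counts = Counter()
    for transaction in block:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:  # candidate is a subset of the transaction
                counts[cand] += 1
    return counts


def reduce_counts(partials, min_support):
    """Reduce step (illustrative): sum per-block counts, keep frequent itemsets."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {c: n for c, n in total.items() if n >= min_support}


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({i for itemset in frequent for i in itemset})
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    ]


def apriori_blocks(blocks, min_support):
    """Level-wise Apriori over a list of transaction blocks (assumed layout)."""
    all_items = sorted({i for b in blocks for t in b for i in t})
    candidates = [frozenset([i]) for i in all_items]
    result, k = {}, 1
    while candidates:
        partials = [map_block(b, candidates) for b in blocks]  # one map per block
        frequent = reduce_counts(partials, min_support)
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result
```

For example, calling `apriori_blocks` on two blocks of small transactions returns a dictionary that maps each frequent itemset (as a `frozenset`) to its total support across all blocks. In a real Hadoop job each block would be processed by a separate mapper, and the candidate list would have to be shipped to every mapper for each level.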