Basic article information

  • Title: HARPP: HARnessing the Power of Power sets for Mining Frequent Itemsets
  • Authors: Muhammad Yasir; Muhammad Asif Habib; Shahzad Sarwar
  • Journal: Public Policy And Administration
  • Print ISSN: 2029-2872
  • Publication year: 2019
  • Volume: 48
  • Issue: 3
  • Pages: 415-431
  • DOI: 10.5755/j01.itc.48.3.21137
  • Publisher: Kaunas University of Technology
  • Abstract: Modern algorithms for mining frequent itemsets suffer a marked deterioration in performance as the minimum support decreases, especially on sparse datasets. Long-tailed itemsets, the frequent itemsets found at lower minimum support, are significant for present-day applications such as recommender systems. In this study, we develop a novel power-set-based method named HARnessing the Power of Power sets (HARPP) for mining frequent itemsets. HARPP iteratively generates power sets to form combinations of overlapping, varying-sized subsets of I, where I is the set of items in a large database. The intrinsic nature of power set generation, together with the use of a set data structure, keeps HARPP fast because most of its operations take constant running time. HARPP scans the dataset only once, without storing it entirely in memory, and mines frequent itemsets on the fly. In contrast to state-of-the-art methods, the efficiency of HARPP increases as the minimum support decreases, which makes it a viable technique for mining long-tailed itemsets. A performance study shows that HARPP is efficient and scalable, and is up to two orders of magnitude faster than the FP-Growth algorithm at lower minimum support, particularly when datasets are sparse.
  • Keywords: Association Rules; Frequent Itemset Mining; Apriori; FP-Growth; Recommendation Systems
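
The abstract describes counting itemsets from power sets in a single scan of the dataset. The sketch below illustrates only that general idea and is not the authors' HARPP algorithm: it enumerates the subsets of each transaction as the transaction streams past and counts them in a hash-based structure. The function name `mine_frequent_itemsets` and the `max_size` cap are assumptions introduced here for illustration, since an uncapped power set grows exponentially with transaction length.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(transactions, min_support, max_size=3):
    """Naive single-pass subset counting (illustrative only, not HARPP itself).

    transactions: any iterable of item collections; it is consumed once,
                  so a generator reading from disk works as well as a list.
    min_support:  fraction of transactions (0..1) an itemset must appear in.
    max_size:     hypothetical cap on itemset size to bound the enumeration.
    """
    counts = Counter()
    n_transactions = 0

    for transaction in transactions:          # the only scan of the data
        n_transactions += 1
        items = sorted(set(transaction))      # drop duplicate items
        largest = min(max_size, len(items))
        for size in range(1, largest + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1   # average O(1) hash update

    threshold = min_support * n_transactions
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


if __name__ == "__main__":
    baskets = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
    for itemset, count in mine_frequent_itemsets(baskets, 0.5).items():
        print(sorted(itemset), count)
```

Keeping only a counter keyed by `frozenset` means the dataset itself never has to be held in memory, which mirrors the single-scan property the abstract claims; how HARPP bounds the number of generated subsets is not stated in this record.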