Article information

  • Title: INCREMENTAL PARALLEL CLASSIFIER FOR BIG DATA WITH CASE STUDY: NAIVE BAYES USING MAPREDUCE PATTERNS
  • Authors: VERONICA S. MOERTINI ; MOHAMAD F. SEPTRIANTO ; LIPTIA VENICA
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2019
  • Volume: 97
  • Issue: 11
  • Pages: 3077-3097
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Classification methods can be used to derive value from big data in the form of models, which can then be used to predict new cases. Several parallel classification methods for big data have been developed for Hadoop MapReduce as well as for Spark. As new data keeps arriving, the models must be updated from time to time so that they represent both the old and the new data. The computations must be efficient and scalable to handle big data. This research aims to enhance existing parallel classifiers so that they act as incremental classifiers that handle batches of big data. The results are presented as follows. First, the architecture and main concept of the enhancement are presented. Second, the proposed incremental parallel Naïve Bayes classifier (NBC) based on MapReduce, which handles datasets with discrete attributes, is discussed in detail. Two series of experiments were performed on Hadoop clusters with 5 and 10 nodes. The results show that the incremental parallel NBC has acceptable accuracy and is efficient and scalable.
  • Keywords: Big Data Classification Method; Incremental Parallel Classifier; MapReduce Patterns
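
The abstract describes an incremental Naïve Bayes classifier over discrete attributes but this record does not give the algorithm. The sketch below is not the authors' method; it only illustrates, in plain single-process Python, one general property that makes such a classifier possible: a Naïve Bayes model for discrete attributes is a set of counts, and counts from a new batch can be added to the stored ones without rereading old data. The function names (`count_batch`, `merge_models`, `predict`) and the use of Laplace smoothing are assumptions made for illustration.

```python
from collections import Counter
from math import log

def count_batch(records):
    """Count class labels and (attribute index, value, label) triples for one batch.

    In a MapReduce setting the per-record increments would be emitted by
    mappers and summed by reducers; here both steps are done in one loop.
    """
    class_counts = Counter()
    feature_counts = Counter()
    for features, label in records:
        class_counts[label] += 1
        for i, value in enumerate(features):
            feature_counts[(i, value, label)] += 1
    return class_counts, feature_counts

def merge_models(model, batch_model):
    """Incremental update: add the new batch's counts to the stored counts."""
    return model[0] + batch_model[0], model[1] + batch_model[1]

def predict(model, features, n_values):
    """Return the most probable label; n_values[i] is the number of distinct
    values of attribute i (needed for Laplace smoothing, an assumption here)."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, c in class_counts.items():
        score = log(c / total)
        for i, value in enumerate(features):
            score += log((feature_counts[(i, value, label)] + 1) / (c + n_values[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Under these assumptions, merging the counts of two batches gives the same model as counting over both batches at once, so only the counts need to be kept between updates.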