
Article information

  • Title: Detecting spam e-mails using stop word TF-IDF and stemming algorithm with Naïve Bayes classifier on the multicore GPU
  • Authors: Manjit Jaiswal; Sukriti Das; Khushboo Khushboo
  • Journal: International Journal of Electrical and Computer Engineering
  • Electronic ISSN: 2088-8708
  • Publication year: 2021
  • Volume: 11
  • Issue: 4
  • Pages: 3168-3175
  • DOI: 10.11591/ijece.v11i4.pp3168-3175
  • Language: English
  • Publisher: Institute of Advanced Engineering and Science (IAES)
  • Abstract: A spam filter is a program that identifies unwanted e-mails and prevents them from reaching a user's mailbox. This study examines how filtering algorithms can be applied to a collection of e-mails containing both ham and spam. First, the working principle and implementation steps of stop-word removal, TF-IDF, and stemming on NVIDIA's Tesla P100 GPU are discussed; the findings are then verified by running the Naïve Bayes algorithm. After training and testing on a spam e-mail dataset taken from Kaggle, the proposed method reached a training accuracy of 99.67% and a testing accuracy of about 99.03% on the multicore GPU. These are improvements of about 0.22 and 0.18 percentage points, respectively, over the conventional method of applying only Naïve Bayes to the same dataset, which gave training and testing accuracies of 99.45% and 98.85%. The GPU also shortened execution time: training took 1.361 seconds on the GPU, about 1.49X faster than the 2.029 seconds on the CPU, and testing took 1.978 seconds on the GPU, about 1.15X faster than the 2.280 seconds on the CPU.
  • Keywords: Google Colab; GPU; Naïve Bayes; NVIDIA; Porter's algorithm; stemming; Tesla; TF-IDF
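
The pipeline named in the abstract (stop-word removal, stemming, TF-IDF weighting, then a Naïve Bayes classifier) can be illustrated with a minimal, CPU-only sketch in plain Python. This is not the authors' implementation and does not reproduce their GPU setup or results. The stop-word list, the toy training messages, and every function name below are hypothetical, and `crude_stem` is a simple suffix stripper standing in for Porter's algorithm.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "to", "for", "you", "your", "at", "and", "of", "on"}

def crude_stem(word):
    """Strip a few common suffixes. A stand-in for Porter's algorithm, not an implementation of it."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, keep purely alphabetic tokens, drop stop words, then stem."""
    tokens = [t for t in text.lower().split() if t.isalpha()]
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(doc, idf):
    """Term count times IDF; terms unseen in training are ignored."""
    counts = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with Laplace smoothing."""
    class_docs = Counter(labels)
    weight = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for term, value in vec.items():
            weight[label][term] += value
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(weight[label].values()) + alpha * len(vocab)
        log_like = {t: math.log((weight[label][t] + alpha) / total) for t in vocab}
        model[label] = (math.log(n_docs / len(labels)), log_like)
    return model

def predict(model, vec):
    """Return the label with the highest log posterior score."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)

# Toy training data, invented for illustration only.
TRAIN = [
    ("win free prize money now", "spam"),
    ("claim your free cash prize", "spam"),
    ("free money offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch with the project team on monday", "ham"),
]

docs = [preprocess(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels, vocab=set(idf))

def classify(text):
    return predict(model, tfidf(preprocess(text), idf))

print(classify("free prize waiting"))
print(classify("project meeting monday"))
```

Feeding TF-IDF weights rather than raw counts into the multinomial model is a common practical choice; whether the paper combines the two in exactly this way is not stated in the abstract.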