
Article Information

  • Title: AUTOMATIC NEWS BLOG CLASSIFIER USING IMPROVED K-NEAREST NEIGHBOR AND TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY
  • Authors: IRMA YUNITA ; SENG HANSUN
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year: 2019
  • Volume: 97
  • Issue: 15
  • Pages: 4202-4212
  • Publisher: Journal of Theoretical and Applied
  • Abstract: The development of internet technology increases the need for information management for the public. One form of information on the internet is the web page. The growth in the number of web pages, driven by blog content writing, has been made easier by Content Management Systems such as WordPress. However, this information still needs to be organized so that the public can easily find the relevant information they are looking for. One way to organize information is text classification. Thus, the objective of this study is to build a WordPress plugin, named News Blog Classifier, that helps users classify their articles automatically. Two techniques, the K-Nearest Neighbor algorithm and TF-IDF, were used to classify blog content automatically into four categories: Health, Economics, Sports, and Technology. Furthermore, the K-Nearest Neighbor algorithm was improved by adding a Cosine Similarity calculation. Based on the test results, the highest precision is 0.92, obtained for the Health category; the highest recall is 0.97, for the Economics category; and the highest F-measure is 0.88, also for the Economics category. Overall, this study has successfully built an automatic text classification plugin for news blog content on the WordPress CMS that can help bloggers classify their news articles automatically. In addition, the testing phase yielded a threshold value for each of the categories used in this study, which can be used in further research.
  • Keywords: Content Management System; K-Nearest Neighbor; Text Classification; TF-IDF; WordPress
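
The abstract describes classifying articles with TF-IDF weighting and a K-Nearest Neighbor step that uses Cosine Similarity. The following is a minimal sketch of that general idea in Python, not the authors' plugin code. The function names, the smoothed IDF formula, the similarity-weighted vote, the value k=3, and the toy training sentences are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer (assumption: no stemming or stop-word removal).
    return text.lower().split()

def build_idf(docs_tokens):
    # Document frequency: in how many training documents each term occurs.
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    # Smoothed IDF (an assumed variant; the abstract gives no exact formula).
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(tokens, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in training are dropped.
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}

def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text, train, k=3):
    # train is a list of (document text, category label) pairs.
    docs = [tokenize(doc) for doc, _ in train]
    idf = build_idf(docs)
    vectors = [tfidf(tokens, idf) for tokens in docs]
    query = tfidf(tokenize(text), idf)
    # Rank training documents by similarity to the query, most similar first.
    scored = sorted(
        ((cosine(query, vec), label) for vec, (_, label) in zip(vectors, train)),
        reverse=True,
    )
    # Similarity-weighted vote among the k nearest neighbours (assumed scheme).
    votes = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim
    return votes.most_common(1)[0][0]

# Toy training set invented for this sketch; it is not data from the study.
TRAIN = [
    ("doctors recommend vaccine for flu patients", "Health"),
    ("hospital reports fewer flu patients", "Health"),
    ("central bank raises interest rates", "Economics"),
    ("inflation slows as bank holds rates", "Economics"),
    ("team wins football match in final", "Sports"),
    ("striker scores twice in football final", "Sports"),
    ("new smartphone chip boosts battery", "Technology"),
    ("software update fixes smartphone bug", "Technology"),
]

print(classify("bank expected to cut interest rates", TRAIN))
print(classify("football striker injured before match", TRAIN))
```

Weighting each neighbour's vote by its cosine similarity means that neighbours sharing no terms with the query contribute nothing, which is one plausible way to combine the two measures. The per-category threshold values mentioned in the abstract are not reproduced here because the record does not state them.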