Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year: 2019
Volume: 97
Issue: 15
Pages: 4202-4212
Publisher: Journal of Theoretical and Applied
Abstract: The development of internet technology has increased the public's need for information management. One form of information on the internet is the web page. The number of web pages keeps growing, in part because Content Management Systems such as WordPress have simplified the writing of blog content. However, this information still needs to be organized so that the public can easily find the relevant information they are looking for. One way to organize information is text classification. The objective of this study is therefore to build a WordPress plugin, named News Blog Classifier, that helps users classify their articles automatically. Two techniques, the K-Nearest Neighbor algorithm and TF-IDF, were used to classify blog content automatically into four categories: Health, Economics, Sports, and Technology. In addition, the K-Nearest Neighbor algorithm was enhanced by adding a Cosine Similarity calculation. Based on the test results, the highest precision is 0.92 (Health category), the highest recall is 0.97 (Economics category), and the highest F-measure is 0.88 (Economics category). Overall, an automatic text classification plugin for news blog content on the WordPress CMS was successfully built, and it can help bloggers classify their news articles automatically. The testing phase also identified a threshold value for each of the categories used in this study, which can be used in further research.
Keywords: Content Management System; K-Nearest Neighbor; Text Classification; TF-IDF; WordPress
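
The approach the abstract describes (TF-IDF weighting followed by K-Nearest Neighbor with cosine similarity) can be illustrated with a minimal sketch. This is not the plugin's code: the record gives no implementation details, so the whitespace tokenizer, the smoothed IDF formula, the similarity-weighted vote, the value of k, and the toy training set below are all assumptions made for illustration. The function names (`tokenize`, `build_idf`, `tfidf`, `cosine`, `knn_classify`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def build_idf(docs):
    # Smoothed inverse document frequency (one common variant, assumed here).
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in training are dropped.
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in tf.items() if t in idf}


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(text, train, k=3):
    # train: list of (document text, category label) pairs.
    docs = [tokenize(doc) for doc, _ in train]
    idf = build_idf(docs)
    vectors = [tfidf(tokens, idf) for tokens in docs]
    query = tfidf(tokenize(text), idf)
    # Rank training documents by similarity to the query, most similar first.
    scored = sorted(
        ((cosine(query, vec), label) for vec, (_, label) in zip(vectors, train)),
        reverse=True,
    )
    # Assumption: the k nearest neighbors vote, weighted by their similarity.
    votes = Counter()
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return votes.most_common(1)[0][0]


# Toy training set (invented for illustration, not data from the study).
TRAIN = [
    ("doctor hospital vaccine patient health", "Health"),
    ("vaccine disease treatment clinic", "Health"),
    ("market inflation bank economy stocks", "Economics"),
    ("bank interest rates trade economy", "Economics"),
    ("football match team goal league", "Sports"),
    ("team player tournament coach match", "Sports"),
    ("software smartphone chip internet app", "Technology"),
    ("internet startup software cloud", "Technology"),
]

if __name__ == "__main__":
    print(knn_classify("team wins football match in final", TRAIN))
```

The per-category threshold mentioned in the abstract is not modeled here, because the record does not say how it is defined or applied.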