Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2019
Volume: 97
Issue: 20
Pages: 2441-2451
Publisher: Journal of Theoretical and Applied
Abstract: Document classification is still largely performed by human experts in the field to which a document belongs. In recent years, researchers, including the authors of this study, have proposed a variety of methods for classifying documents automatically. This research uses movie genre classification based on synopses as its object of study. Previous research proposed a statistical model based on the Naive Bayes algorithm, which was shown to give the best results compared with other classification algorithms. Several studies have proposed adding N-gram feature selection to pre-processing, which improved the resulting classification. However, a weakness of the N-gram approach is that the value of n is still chosen randomly or by trial and error. To address this weakness, this study proposes optimizing the N-gram model to obtain the optimum value of n using the Particle Swarm Optimization algorithm. Obtaining an optimal value of n will improve document classification performance and the results obtained with Naive Bayes. The proposed model can then be used as a document classification model for other objects.
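
The idea summarized in the abstract, searching for the N-gram size n with Particle Swarm Optimization and scoring each candidate by Naive Bayes accuracy, can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' implementation. Everything specific in it is an assumption: the use of word-level n-grams, the toy synopses and labels, the leave-one-out accuracy used as fitness, the add-one smoothing with one extra slot for unseen n-grams, the PSO coefficients (w = 0.7, c1 = c2 = 1.5), and all function names.

```python
import math
import random
from collections import Counter, defaultdict

# Toy synopses invented for illustration only; they are not from the paper.
DOCS = [
    "a detective hunts a killer in the dark city",
    "a killer stalks the detective through the city",
    "two friends fall in love over one summer",
    "a summer romance turns into true love",
    "the detective finds the killer at last",
    "old friends find love again in the summer",
]
LABELS = ["thriller", "thriller", "romance", "romance", "thriller", "romance"]


def ngrams(text, n):
    """Return the word n-grams of `text` as space-joined strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_nb(docs, labels, n):
    """Collect per-class n-gram counts, class priors, and the vocabulary."""
    counts, priors, vocab = defaultdict(Counter), Counter(labels), set()
    for doc, label in zip(docs, labels):
        grams = ngrams(doc, n)
        counts[label].update(grams)
        vocab.update(grams)
    return counts, priors, vocab


def predict_nb(model, doc, n):
    """Pick the class with the highest multinomial Naive Bayes log score."""
    counts, priors, vocab = model
    total_docs = sum(priors.values())
    best_label, best_score = None, -math.inf
    for label in priors:
        total = sum(counts[label].values())
        score = math.log(priors[label] / total_docs)
        for gram in ngrams(doc, n):
            # Add-one smoothing; the extra +1 reserves a slot for unseen n-grams.
            score += math.log((counts[label][gram] + 1) / (total + len(vocab) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(n, docs, labels):
    """Leave-one-out accuracy of Naive Bayes using n-grams of size n."""
    hits = 0
    for i in range(len(docs)):
        model = train_nb(docs[:i] + docs[i + 1:], labels[:i] + labels[i + 1:], n)
        hits += predict_nb(model, docs[i], n) == labels[i]
    return hits / len(docs)


def pso_best_n(docs, labels, n_max=4, particles=5, iters=10, seed=0):
    """One-dimensional PSO over [1, n_max]; positions are rounded to integer n."""
    rng = random.Random(seed)

    def fitness(x):
        return accuracy(int(round(x)), docs, labels)

    pos = [rng.uniform(1, n_max) for _ in range(particles)]
    vel = [0.0] * particles
    pbest = pos[:]
    pbest_fit = [fitness(x) for x in pos]
    g = max(range(particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients
    for _ in range(iters):
        for i in range(particles):
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (pbest[i] - pos[i])
                      + c2 * rng.random() * (gbest - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], 1), n_max)  # clamp to the search range
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i], fit
    return int(round(gbest)), gbest_fit


if __name__ == "__main__":
    best_n, best_acc = pso_best_n(DOCS, LABELS)
    print(f"selected n = {best_n}, leave-one-out accuracy = {best_acc:.2f}")
```

Because n is a small integer, an exhaustive search over 1..n_max would be just as feasible on a toy problem like this one. The sketch uses PSO only to mirror the approach named in the abstract, where the swarm replaces manual trial and error.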