Journal title: Journal of King Saud University - Computer and Information Sciences
Print ISSN: 1319-1578
Publication year: 2022
Volume: 34
Issue: 2
Pages: 264-269
Language: English
Publisher: Elsevier
Abstract: The timelines of social media users contain text, images, audio, and video. This large, unstructured data can be processed with techniques such as text processing or image processing. In this study, processed text data is used to classify Twitter users’ personalities based on the DISC framework. From an initial collection of 292 users, we semi-automatically filtered the set to keep only personal accounts that post in Indonesian. To observe and assess a user’s personality from the word choice in their tweets, we built keyword vocabularies corresponding to the DISC framework and theory. The study runs four experimental scenarios, which vary in whether the keywords and text data are stemmed and in whether the keyword frequency calculation is weighted. Weighting keyword counts according to each keyword’s level did not improve the results, and neither did stemming: the best results came from the scenario with no stemming and no weighting. This study is preliminary research toward an automatic profiling system that combines Natural Language Processing and Machine Learning approaches.
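As a rough illustration of the four scenarios the abstract describes (stemmed or not, weighted or not), the keyword frequency calculation might be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the `KEYWORDS` vocabulary, its level weights, the `strip_nya` stemmer, and the rule of picking the highest-scoring dimension are all hypothetical placeholders, since the abstract does not give the actual vocabulary, weighting formula, or stemmer.

```python
import re
from collections import Counter

# HYPOTHETICAL vocabulary: DISC dimension -> {keyword: level}.
# Words and levels are placeholders, not the study's keyword lists.
KEYWORDS = {
    "D": {"menang": 2, "target": 1},
    "I": {"seru": 2, "teman": 1},
    "S": {"sabar": 2, "tenang": 1},
    "C": {"teliti": 2, "data": 1},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def strip_nya(word):
    """Toy stemmer (assumption): strips only the suffix '-nya'."""
    return word[:-3] if word.endswith("nya") and len(word) > 3 else word


def disc_scores(tweets, keywords=KEYWORDS, stem=None, weighted=False):
    """Count keyword occurrences per DISC dimension.

    stem:     optional function applied to both tokens and keywords.
    weighted: if True, multiply each count by the keyword's level.
    """
    stem = stem or (lambda w: w)
    counts = Counter(stem(tok) for tweet in tweets for tok in tokenize(tweet))
    scores = {}
    for dim, vocab in keywords.items():
        scores[dim] = sum(
            counts[stem(word)] * (level if weighted else 1)
            for word, level in vocab.items()
        )
    return scores


def classify(scores):
    """Pick the top-scoring dimension (ties go to the first one listed)."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    tweets = ["Target menang menang!", "teman seru"]
    for use_stem in (False, True):
        for weighted in (False, True):
            s = disc_scores(tweets, stem=strip_nya if use_stem else None,
                            weighted=weighted)
            print(f"stemmed={use_stem} weighted={weighted}", s, classify(s))
```

The `stem` and `weighted` arguments map onto the two axes varied across the scenarios, so the four combinations can be compared on the same input.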