Article Information

  • Title: Spam Detection on Profile and Social Media Network using Principal Component Analysis (PCA) and K-means Clustering
  • Authors: Samuel Ady Sanjaya; Kridanto Surendro
  • Journal: International Journal of Advances in Soft Computing and Its Applications
  • Print ISSN: 2074-8523
  • Year of publication: 2019
  • Volume: 11
  • Issue: 3
  • Pages: 108-123
  • Publisher: International Center for Scientific Research and Studies
  • Abstract: Social media, as a means of communicating in cyberspace, continues to grow in number of users, in utilization, and in resulting impact. Existing social media ecosystems are shaped by public figures, trending topics, and even spam and spammers. Spam accounts have mostly been detected with classification, that is, supervised learning. This becomes a problem when new data arrives and the supervised model is not updated, which increases the likelihood of false detections. To address this problem, the study uses Principal Component Analysis (PCA) and K-means clustering with Mahalanobis distance to detect groups of users with similar properties and so identify spam. The study uses 150 thousand Twitter data items and 15 thousand account records, described as graph data. We find that detection errors in the classification approach arise because it allows only two classes, spam and non-spam, whereas other classes of accounts show characteristics of spam without being spam. In this paper, we define 5 clusters: normal, news account and public activist, foreign account, public figure, and spam.
  • Keywords: K-means; Principal Component Analysis (PCA); Social Media; Social Network Analysis; Spam.
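
The pipeline named in the abstract (PCA followed by K-means with Mahalanobis distance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of a single covariance matrix estimated from the projected data, the number of retained components, and the random synthetic features are all assumptions made for illustration. Only the choice of 5 clusters comes from the abstract.

```python
import numpy as np

def pca_project(X, n_components):
    """Center X and project it onto its top principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def mahalanobis_sq(Z, centers, cov_inv):
    """Squared Mahalanobis distance from each row of Z to each center."""
    diff = Z[:, None, :] - centers[None, :, :]  # shape (n, k, d)
    return np.einsum("nkd,de,nke->nk", diff, cov_inv, diff)

def kmeans_mahalanobis(Z, k, n_iter=50, seed=0):
    """Lloyd-style K-means using one shared covariance (an assumption)."""
    rng = np.random.default_rng(seed)
    cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(Z, rowvar=False)))
    # Initialize centers from k distinct data points.
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        labels = mahalanobis_sq(Z, centers, cov_inv).argmin(axis=1)
        # Keep the old center if a cluster ends up empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    labels = mahalanobis_sq(Z, centers, cov_inv).argmin(axis=1)
    return labels, centers

# Hypothetical stand-in for per-account features (not the paper's data).
X = np.random.default_rng(1).normal(size=(200, 6))
Z = pca_project(X, n_components=3)
labels, centers = kmeans_mahalanobis(Z, k=5)  # 5 clusters, as in the abstract
```

In such a sketch the cluster indices carry no meaning on their own; mapping them to labels such as "public figure" or "spam" would require inspecting the members of each cluster.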