Article Information

Title: Machine Learning based Optimization Scheme for Detection of Spam and Malware Propagation in Twitter
Authors: Savita Kumari Sheoran; Partibha Yadav
Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Publication year: 2021
Volume: 12
Issue: 2
Pages: 495-503
DOI: 10.14569/IJACSA.2021.0120262
Publisher: Science and Information Society (SAI)
Abstract: Social networking sites are a new generation of web services that bring a global community of users together in an online environment. Twitter is one such popular social network, with more than 152 million daily active users posting about half a billion tweets per day. Owing to its immense popularity, the accounts of legitimate Twitter users are always at risk from spammers and hackers. Spam and malware are the most damaging threats reported on the Twitter platform. To preserve privacy and ensure data safety for the online Twitter community, it is necessary to develop a framework that safeguards accounts from such malicious attackers. Machine Learning is a recently matured and widely used technique for preventing the propagation of such malicious activity in social media. Although Machine Learning based techniques have yielded promising results in filtering undesired content from user tweets, their efficiency remains restricted by the technological limits of the technique used. To devise a more efficient model for detecting the propagation of spam and malware on Twitter, this research proposes a Machine Learning based optimization scheme built on a hybrid similarity measure (Cosine and Jaccard) used in conjunction with a Genetic Algorithm (GA) and an Artificial Neural Network (ANN). The hybrid of Cosine and Jaccard similarity is applied to the Twitter dataset to identify tweets containing spam and malware. In conjunction with it, the GA is used to enhance the training rate and reduce the training error by automatically selecting the designed fitness function, while the ANN is applied to classify malicious tweets through a voting rule. Simulation experiments were conducted to compute the precision, recall, and F-measure. The results of the proposed Machine Learning based optimization scheme were compared with existing state-of-the-art techniques in this area. The comparative study reveals that the proposed model is faster and more precise than the existing models.
Keywords: Social networking sites; Twitter; spam; malware; Cosine similarity; Jaccard similarity; genetic algorithm; artificial neural network
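
Illustrative note: the abstract names a hybrid of Cosine and Jaccard similarity but does not say how the two scores are combined. The sketch below is only one plausible reading, not the authors' method. It assumes whitespace tokenization, term-frequency vectors for the cosine score, token sets for the Jaccard score, and a weighted average controlled by a hypothetical `weight` parameter. All function names are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's
    # preprocessing is not described in the abstract.
    return text.lower().split()


def cosine_similarity(a, b):
    # Cosine of the angle between the term-frequency vectors of two texts.
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    # Size of the token-set intersection divided by the size of the union.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    if not union:
        return 0.0
    return len(sa & sb) / len(union)


def hybrid_similarity(a, b, weight=0.5):
    # Assumption: a weighted average of the two scores; `weight` is a
    # hypothetical parameter, not taken from the paper.
    return weight * cosine_similarity(a, b) + (1 - weight) * jaccard_similarity(a, b)
```

A tweet could then be scored against a known spam template, for example `hybrid_similarity(tweet_text, known_spam_text)`, and flagged when the score exceeds a threshold chosen on validation data. How the paper feeds such scores into the GA and ANN stages is not stated in the abstract.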