Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Publication year: 2021
Volume: 12
Issue: 10
DOI: 10.14569/IJACSA.2021.0121032
Language: English
Publisher: Science and Information Society (SAI)
Abstract: Information spreads rapidly on the internet, which has become people's first choice for medication tips related to their health problems. However, this ever-growing usage has also led to the spread of misinformation. Misinformation in healthcare can have severe effects on people's lives, so effort is required both to detect it and to fact-check information before it is used. In this paper, the authors propose a model to detect and fact-check misinformation in the healthcare domain. The model extracts healthcare-related URLs from the web, pre-processes them, computes term frequency, extracts sentiment and grammatical features to detect misinformation, and computes three measures (Euclidean distance, Jaccard distance, and cosine similarity) to fact-check the URLs as True or False against a manually generated dataset with expert opinions. The model was evaluated using five state-of-the-art machine learning classifiers: Logistic Regression, Support Vector Machine, Naïve Bayes, Decision Tree, and Random Forest. The experimental results showed that sentiment features are crucial for detecting misinformation, as more negative words are found in URLs containing misinformation than in URLs with true information. Naïve Bayes achieved the highest accuracy of all the models at 98.7%, whereas the Decision Tree classifier had the lowest at 92.88%. Among the three measures, Jaccard distance was found to be the most accurate, ahead of Euclidean distance and cosine similarity.
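Note: The three measures named in the abstract can be illustrated on term-frequency vectors with the minimal Python sketch below. It is not the authors' implementation. The tokenizer, the function names, and the choice to compute Jaccard distance over the sets of distinct terms are all assumptions made for illustration, and the abstract does not state how the measures are thresholded into True or False, so the sketch stops at computing them.

```python
import math
import re
from collections import Counter


def term_frequency(text):
    """Count lowercase word tokens (an assumed, simplified pre-processing step)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def euclidean_distance(tf_a, tf_b):
    """Euclidean distance between two term-frequency vectors."""
    vocab = set(tf_a) | set(tf_b)
    # Counter returns 0 for terms that are missing from one text.
    return math.sqrt(sum((tf_a[w] - tf_b[w]) ** 2 for w in vocab))


def jaccard_distance(tf_a, tf_b):
    """1 - |A ∩ B| / |A ∪ B| over the sets of distinct terms."""
    a, b = set(tf_a), set(tf_b)
    union = a | b
    if not union:
        return 0.0  # two empty texts are treated as identical
    return 1.0 - len(a & b) / len(union)


def cosine_similarity(tf_a, tf_b):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(tf_a[w] * tf_b[w] for w in set(tf_a) & set(tf_b))
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # undefined for an empty text; 0.0 is a convention here
    return dot / (norm_a * norm_b)


# Hypothetical texts: a claim taken from a web page and an expert-reviewed statement.
claim = term_frequency("Vitamin C cures the common cold")
reference = term_frequency("Vitamin C does not cure the common cold")

print(euclidean_distance(claim, reference))
print(jaccard_distance(claim, reference))
print(cosine_similarity(claim, reference))
```

Note that Euclidean and Jaccard are distances (lower means more alike) while cosine is a similarity (higher means more alike), so any comparison across the three has to account for the direction of each score.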