Article Information

  • Title: FEATURE SELECTION METHODS FOR PREDICTING THE POPULARITY OF ONLINE NEWS: COMPARATIVE STUDY, AND A PROPOSED METHOD
  • Authors: SAMAH OSAMA M. KAMEL ; MOHAMED NOUR
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year: 2018
  • Volume: 96
  • Issue: 20
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Nowadays, accessing the Internet has become an interesting part of people's lives. It would be promising if the popularity of news could be accurately predicted prior to its publication. Online classification is well suited for learning from large and high-dimensional datasets. The main objective of this research work is to predict and evaluate the popularity of online news. Several feature selection approaches are adopted to reduce the dataset and thereby improve classification and prediction accuracy. Filtering approaches such as correlation, information gain, and Relief are used to remove unimportant features so that the classification of new instances is more accurate. These approaches are applied to select the most significant features in the dataset, and their performance is compared. Moreover, Bayes Network and K-Nearest Neighbors algorithms are trained for classification and prediction. The training set is used to construct the models, while the testing set is used for validation. The work is run and tested on a dataset taken from the UCI Machine Learning Repository containing thousands of articles with sixty-two attributes. A feature selection method is proposed based on feature extraction and/or feature fusion. A comparative study is carried out among the adopted methods and the proposed one. The performance of the adopted classification and prediction models is assessed using measurable criteria such as precision, recall, accuracy, and error to highlight the advantages and disadvantages of the adopted approaches and the proposed one. In the experimental work, the proposed method shows promising performance and outperforms the adopted ones.
  • Keywords: Feature Selection; Classification Methods; Popularity Prediction; High Dimensional Datasets; Online News.
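
The filter-then-classify workflow named in the abstract (rank features with a filter criterion, keep the top ones, then train a classifier such as K-Nearest Neighbors) can be illustrated with a minimal sketch. This is not the authors' code or their proposed method: it uses only a correlation filter and a plain k-NN vote, and it runs on synthetic toy data rather than the UCI dataset. All function names (`pearson`, `rank_features`, `knn_predict`, `project`, `accuracy`, `make_toy_data`) and the toy data layout are assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def rank_features(X, y):
    """Return feature indices sorted by |correlation with the label|, best first."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def project(X, cols):
    """Keep only the selected feature columns."""
    return [[row[j] for j in cols] for row in X]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test instances whose k-NN prediction matches the label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t
               for x, t in zip(test_X, test_y))
    return hits / len(test_y)


def make_toy_data(n, seed):
    """Hypothetical data: column 1 carries the label signal, the rest is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = 2.0 * label + rng.gauss(0, 0.3)
        noise = [rng.gauss(0, 5.0) for _ in range(4)]
        X.append([noise[0], informative, noise[1], noise[2], noise[3]])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data(200, seed=0)
    train_X, train_y = X[:150], y[:150]   # training set builds the model
    test_X, test_y = X[150:], y[150:]     # testing set validates it
    ranking = rank_features(train_X, train_y)
    top = ranking[:1]
    print("feature ranking:", ranking)
    print("accuracy, all features:", accuracy(train_X, train_y, test_X, test_y))
    print("accuracy, selected    :",
          accuracy(project(train_X, top), train_y, project(test_X, top), test_y))
```

The ranking is computed on the training split only, so the held-out split plays no part in choosing features. Information gain or Relief could be substituted for `pearson` as the scoring function without changing the rest of the sketch.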