
Article Information

  • Title: Different Type of Feature Selection for Text Classification
  • Authors: M. Ramya; J. Alwin Pinakas
  • Journal: International Journal of Computer Trends and Technology
  • Electronic ISSN: 2231-2803
  • Year of publication: 2014
  • Volume: 10
  • Issue: 2
  • Pages: 102-107
  • DOI: 10.14445/22312803/IJCTT-V10P118
  • Publisher: Seventh Sense Research Group
  • Abstract: Text categorization is the task of deciding whether a document belongs to any of a set of prespecified document classes. Automatic classification schemes can greatly facilitate categorization. Categorizing documents is challenging because the number of discriminating words can be very large, and many existing algorithms simply do not work with this many features. Most text categorization tasks involve many irrelevant features as well as many relevant ones. The main objective is to propose a text classification approach based on feature selection and preprocessing, thereby reducing the dimensionality of the feature vector and increasing classification accuracy. The proposed method applies text preprocessing methods to different datasets and then extracts a feature vector for each new document using various feature weighting methods to enhance classification accuracy. The classifier is then trained with the Naive Bayesian (NB) and K-nearest neighbor (KNN) algorithms; for KNN, the prediction is made according to the category distribution among the k nearest neighbors. Experimental results show that the methods compare favorably with other methods in terms of effectiveness and efficiency.
  • Keywords: Feature selection; K-Nearest Neighbor; Naïve Bayesian; Text classification
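
The abstract describes weighting document features and then predicting a category from the label distribution among the k nearest neighbors. The following is a minimal sketch of that idea, not the authors' implementation: it assumes TF-IDF as the feature weighting method, cosine similarity as the neighbor measure, and whitespace tokenization as a stand-in for preprocessing. All function names (`tokenize`, `tfidf_vectors`, `vectorize`, `cosine`, `knn_predict`) and the toy documents are hypothetical, and the Naive Bayesian classifier mentioned in the abstract is not covered.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed idf of the form log(N / df) + 1.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def vectorize(text, idf):
    # Terms never seen in the training documents are dropped.
    counts = Counter(tokenize(text))
    return {term: tf * idf[term] for term, tf in counts.items() if term in idf}


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, vectors, labels, k=3):
    # Rank training documents by similarity to the query, most similar first.
    ranked = sorted(
        zip(vectors, labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    # Majority vote over the category distribution of the k nearest neighbors.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training data, invented purely for illustration.
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the coach praised the team defence",
    "stock markets fell sharply today",
    "the bank raised interest rates",
    "investors sold shares as markets fell",
]
train_labels = ["sports", "sports", "sports", "finance", "finance", "finance"]

train_vectors, idf_table = tfidf_vectors(train_docs)
query_vector = vectorize("the team scored a goal", idf_table)
print(knn_predict(query_vector, train_vectors, train_labels, k=3))
```

Dropping low-weight terms from each vector before calling `knn_predict` would be one simple way to reduce the dimensionality the abstract refers to; the selection criteria the paper actually compares are not stated in this record.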