
Article Information

  • Title: UNSUPERVISED FEATURE SELECTION FOR TEXT CLUSTERING USING DIFFERENTIAL INVERSE DOCUMENT FREQUENCY
  • Authors: Sivaram Prasad Nalluri ; Rajasekhara Rao Kurra
  • Journal: Indian Journal of Computer Science and Engineering
  • Print ISSN: 2231-3850
  • Electronic ISSN: 0976-5166
  • Year of publication: 2021
  • Volume: 12
  • Issue: 4
  • Pages: 790-797
  • DOI: 10.21817/indjcse/2021/v12i4/211204014
  • Language: English
  • Publisher: Engg Journals Publications
  • Abstract: Text clustering is gaining importance among researchers because of the rapid increase in the availability of online text collections without class labels. It helps to organize, summarize, and retrieve useful information from corpora. The high dimensionality of text datasets leads to poor performance of clustering algorithms. Dimensionality can be reduced using feature extraction or feature selection methods. Feature selection methods scale well and are easy to interpret. An unsupervised univariate filter feature selection method is proposed for dimensionality reduction. The proposed method outperformed nine other filter methods reported in the literature by identifying the most relevant features, which led to good clustering performance on eight popular text datasets.
  • Keywords: Feature Selection; Unsupervised; Filter Method; Text Clustering; Differential Inverse Document Frequency