
Article Information

  • Title: Document Clustering: How to Measure Quality of Clusters in Absence of Ground Truth
  • Authors: Iti Sharma; Harish Sharma
  • Journal: International Journal of Computer Science & Technology
  • Print ISSN: 2229-4333
  • Electronic ISSN: 0976-8491
  • Publication year: 2018
  • Volume: 9
  • Issue: 2
  • Pages: 28-32
  • Language: English
  • Publisher: Ayushmaan Technologies
  • Abstract: Demand for simple and scalable clustering algorithms for text documents is increasing as the volume of data generated through the internet explodes. Such data have no known classes, and extrinsic measures of quality are not sufficient to indicate which algorithm is better for an application. This paper suggests four different intrinsic measures that can be used to evaluate cluster output, and hence to choose the clustering method that suits a particular application. The proposed metrics measure the homogeneity and coherence of documents within a cluster, as well as the overlap among different clusters, in an interpretable form.
  • Keywords: Document Clustering; Spherical K-Means; Intrinsic Measures; Performance; Cluster Quality