Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Electronic ISSN: 0976-8491
Year: 2018
Volume: 9
Issue: 2
Pages: 28-32
Language: English
Publisher: Ayushmaan Technologies
Abstract: Demand for simple and scalable clustering algorithms for text documents is increasing as the volume of data generated through the internet explodes. There are no known classes for such data, and extrinsic measures of quality are not sufficient to indicate which algorithm is better for an application. This paper suggests four different intrinsic measures that can be used to evaluate cluster output, and hence to choose the clustering method that suits a particular application. The proposed metrics measure the homogeneity and coherence of documents in a cluster, as well as the overlap among different clusters, in an interpretable form.