Article Information

  • Title: Relevant Data Clustering In Web Search Engine
  • Authors: N. NAGAKUMARI; P. SRIVALLI; K. SATYA TEJ
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Publication year: 2011
  • Volume: 2
  • Issue: 5
  • Pages: 2464-2466
  • Publisher: TechScience Publications
  • Abstract: As the number of web pages grows, information retrieval engines often fail to return the relevant documents; clustering can be used to find them. The main purpose of clustering techniques is to partition a set of entities into groups, called clusters, whose members are consistent in terms of similarity. As the name suggests, representative-based clustering techniques use some form of representation for each cluster, so every group has a member that represents it. Representatives reduce the cost of the algorithm and make the process easier to understand. Clustering is commonly done with the k-means algorithm, which has several disadvantages: it is very slow and is not applicable to large databases. The fast greedy k-means algorithm is therefore used, since it overcomes these drawbacks of k-means. It is still limited, however, when applied to a large number of data points, so we introduce an efficient method to compute the distortion for this algorithm.
  • Keywords: Document clustering; k-means; Fast k-means; algorithm
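
As a rough illustration of the terms used in the abstract, the following is a minimal sketch of plain k-means with a brute-force distortion computation. It is not the authors' fast greedy k-means nor their efficient distortion method, which this record does not describe. It assumes distortion means the sum of squared distances from each point to its nearest centroid, and every name in it (`squared_distance`, `nearest`, `distortion`, `k_means`) and the sample points are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def distortion(points, centroids):
    """Assumed definition: sum of squared distances to the nearest centroid.

    This brute-force version costs O(n * k) distance evaluations per call,
    which is the kind of cost that grows with the number of data points.
    """
    return sum(squared_distance(p, centroids[nearest(p, centroids)])
               for p in points)


def k_means(points, k, iterations=100, seed=0):
    """Plain k-means (Lloyd's iteration) with random initial centroids."""
    rng = random.Random(seed)
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(dim)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids


# Hypothetical 2-D sample data forming two loose groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids = k_means(points, k=2)
print("centroids:", centroids)
print("distortion:", distortion(points, centroids))
```

Each centroid plays the role of the cluster representative mentioned in the abstract. Each k-means iteration lowers the distortion or leaves it unchanged, which is why the quantity is a natural measure of clustering quality.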