
Basic article information

  • Title: A Relevant Document Information Clustering Algorithm for Web Search Engine
  • Authors: Y.SureshBabu ; K.Venkat Mutyalu ; Y.A.Siva Prasad
  • Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
  • Print ISSN: 2278-1323
  • Publication year: 2012
  • Volume: 1
  • Issue: 8
  • Pages: 16-20
  • Publisher: Shri Pannalal Research Institute of Technology
  • Abstract: Search engines are hubs of information. Advances in computing and information storage have made vast amounts of data available, and the number of World Wide Web users is increasing day by day, so it has become more difficult for users to find the information that matches their interests. The IR community has explored document clustering as an alternative method of organizing retrieval results; clustering lets us group relevant documents together. The purpose of clustering is to partition a set of entities into groups, called clusters, whose members are similar to one another. As the name suggests, representative-based clustering techniques use some form of representation for each cluster, so every group has a member that represents it. The main benefit is to increase the efficiency and decrease the cost of the algorithm. Clustering is usually done with k-means partitioning algorithms and hierarchical clustering algorithms, but these have many disadvantages: they are slow and do not scale to large databases. The fast greedy k-means algorithm is therefore used; it overcomes the drawbacks of the k-means algorithm and is highly accurate and efficient. We introduce an efficient method to calculate the distortion for this algorithm. This helps users find relevant documents more easily than with relevance ranking.
  • Keywords: Information retrieval; K-Means; Fast k-means; Document clustering; Web clustering
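
The abstract refers to k-means partitioning and to the distortion of a clustering without defining either. As a rough illustration only, the sketch below runs plain (Lloyd-style) k-means and reports the distortion as the sum of squared distances from each point to its assigned centroid. This is not the paper's fast greedy k-means variant or its distortion method, which this record does not describe. The function names, the choice to seed centroids with the first k points, and the toy 2-D points standing in for document vectors are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def distortion(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))


def k_means(points, k, max_iters=100):
    """Plain k-means; centroids are seeded with the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = assign(points, centroids)
    for _ in range(max_iters):
        # Move each centroid to the mean of its current members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
    return centroids, labels, distortion(points, centroids, labels)


# Hypothetical toy data: four 2-D points standing in for document vectors.
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, labels, total_distortion = k_means(toy_points, k=2)
print(labels, total_distortion)
```

A lower distortion means the points sit closer to their cluster representatives, which is why it is commonly used to compare clusterings of the same data.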