Article Information

  • Title: Near Duplicate Document Detection Survey
  • Authors: Bassma S. Alsulami; Maysoon F. Abulkhair; Fathy E. Eassa
  • Journal: International Journal of Computer Science and Communication Networks
  • Electronic ISSN: 2249-5789
  • Publication year: 2012
  • Volume: 2
  • Issue: 2
  • Pages: 147-151
  • Publisher: Technopark Publications
  • Abstract: Search engines are a major breakthrough for retrieving information on the web. However, the list of retrieved documents contains a high percentage of duplicate and near-duplicate results, so there is a need to improve the performance of search results. Some current search engines use data filtering algorithms that eliminate duplicate and near-duplicate documents to save users' time and effort. Identifying similar or near-duplicate pairs in a large collection is a significant problem with widespread applications. This survey presents an up-to-date review of the existing literature on duplicate and near-duplicate detection on the Web.
  • Keywords: Duplicate document; near duplicate pages; near duplicate detection; detection approaches