
Article Information

  • Title: On Improved Example-based Search in Digital Libraries via Term Ranking
  • Authors: Sulieman Bani-ahmad; Ghadeer Al-dweik
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year: 2010
  • Volume: 19
  • Issue: 01
  • Publisher: Journal of Theoretical and Applied
  • Abstract:

    Example-based searching, where the user provides an example publication in order to locate similar publications, is becoming commonplace in literature digital libraries. Two approaches to estimating similarity between publications are (i) graph-based approaches, where citation relationships among publications are used to compute similarities, and (ii) text-based approaches, where shared terms between publications are used as an indicator of similarity. In this paper we introduce a new text-based publication-similarity measure that enhances existing example-based searching by utilizing term importance information. Term importance is computed via a proposed graph-based term ranking (GBTR) algorithm. GBTR differs from previous term ranking approaches in that it recursively computes the importance of a term from the entire publication in which the term is observed, rather than relying only on local information. GBTR works well when paired with Okapi BM25. We exhaustively evaluate the performance of GBTR and compare it against existing term-ranking methods such as the Chronological Term Rank (CTR) and Term Proximity (TP) models. Significant improvements in precision over existing approaches are observed: GBTR achieved around a 10% improvement in precision over CTR and around 2% over TP, with much lower computational time and space complexity than the TP approach.

  • Keywords: Okapi system; BM25; Text retrieval; Example-based search; TextRank; Term Proximity.
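
The abstract describes pairing a graph-based term ranking with Okapi BM25 but gives no formulas, so the following is only a minimal sketch of the general idea and not the authors' GBTR algorithm. It assumes a TextRank-style PageRank over a term co-occurrence graph built from the example publication, and it assumes the resulting score simply multiplies each term's BM25 contribution. The function names (`term_importance`, `bm25_weighted`, `rank_similar`) and all parameter values (`window`, `damping`, `k1`, `b`) are hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def term_importance(tokens, window=2, damping=0.85, iterations=30):
    """PageRank-style scores on a term co-occurrence graph (TextRank-like).

    Assumption: two terms are linked when they occur within `window`
    positions of each other in the example publication.
    """
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    terms = set(tokens)
    score = {t: 1.0 for t in terms}
    for _ in range(iterations):
        # Each term receives an equal share of every neighbor's current score.
        score = {
            t: (1 - damping)
            + damping * sum(score[n] / len(neighbors[n]) for n in neighbors[t])
            for t in terms
        }
    return score


def bm25_weighted(query, doc, corpus, importance, k1=1.2, b=0.75):
    """Okapi BM25 score of `doc` for `query`, scaled per term by `importance`."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc)
    total = 0.0
    for term in set(query):
        if term not in tf:
            continue
        df = sum(1 for d in corpus if term in d)
        # Non-negative IDF variant (adds 1 inside the logarithm).
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        norm = tf[term] * (k1 + 1) / (
            tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
        )
        # Assumption: importance multiplies the standard BM25 term score.
        total += importance.get(term, 1.0) * idf * norm
    return total


def rank_similar(example, corpus):
    """Return corpus indices ordered from most to least similar to `example`."""
    importance = term_importance(example)
    scores = [bm25_weighted(example, doc, corpus, importance) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Toy data, invented for illustration only.
corpus = [
    "term ranking improves text retrieval".split(),
    "citation graph analysis of publications".split(),
    "cooking pasta with tomato sauce".split(),
]
example = "graph based term ranking for text retrieval".split()
ranking = rank_similar(example, corpus)
```

In this sketch the example publication plays the role of the query, which matches the example-based search setting: terms that are central in the example's co-occurrence graph contribute more to the similarity score than peripheral ones.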