Article Information

  • Title: MathIRs: Retrieval System for Scientific Documents
  • Authors: Amarnath Pathak; Partha Pakray; Sandip Sarkar
  • Journal: Computación y Sistemas
  • Print ISSN: 1405-5546
  • Publication year: 2017
  • Volume: 21
  • Issue: 2
  • Pages: 253-265
  • Language: English
  • Publisher: Instituto Politécnico Nacional
  • Abstract: Effective retrieval of mathematical content from a vast corpus of scientific documents demands enhancements to conventional indexing and searching mechanisms. The indexing mechanism and the choice of semantic similarity measures largely determine the quality of the results returned by a Math Information Retrieval system (MathIRs). Tokenization and formula unification are among the distinguishing features of the indexing mechanism used in MathIRs; they facilitate sub-formula and similarity search. In addition, scientific documents and user queries in MathIRs contain both math and text content, and matching them requires three modules: Text-Text Similarity (TS), Math-Math Similarity (MS), and Text-Math Similarity (TMS). In this paper we propose a MathIRs comprising these modules and a substitution-tree-based mechanism for indexing mathematical expressions. We also present experimental results for similarity search and argue that the proposed MathIRs will ease the task of scientific document retrieval.
  • Keywords: Natural language processing; information retrieval; MathIRs; indexing.