Basic article information

  • Title: Improving Document Relevancy Using Integrated Language Modeling Techniques
  • Authors: Vimala Balakrishnan; Norshima Humaidi; Ethel Lloyd-Yemoh
  • Journal: Malaysian Journal of Computer Science
  • Print ISSN: 0127-9084
  • Publication year: 2016
  • Volume: 29
  • Issue: 1
  • Publisher: University of Malaya, Faculty of Computer Science and Information Technology
  • Abstract: This paper presents an integrated language model to improve document relevancy for text queries. Specifically, an integrated stemming-lemmatization (SL) model was developed, and its retrieval performance was compared at three document levels: the top 5, 10 and 15 documents. A prototype search engine was developed and fifteen queries were executed. The mean average precision values showed that the SL model outperformed the baseline (i.e. no language processing), the stemming model and the lemmatization model at all three document levels. These results were supported by the precision histograms, which showed that the integrated model improved document relevancy. However, it should be noted that the precision differences between the models were insignificant. Overall, the study found that combining the two language processing techniques, stemming and lemmatization, retrieves more relevant documents.
  • Keywords: Information retrieval; document relevancy; language modeling; stemming; lemmatization; mean average precision
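
The abstract compares the models by mean average precision at the top 5, 10 and 15 documents. As a rough illustration of that metric, and not the authors' implementation, the sketch below computes average precision over the top k results of each query and then averages across queries. The function names, the normalisation by the number of relevant documents found within the top k (one of several conventions in use), and the sample relevance judgements are all assumptions made for this example; none of them come from the paper.

```python
from typing import Sequence


def average_precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Average precision over the top-k results of one ranked list.

    relevance[i] is True when the document at rank i + 1 was judged relevant.
    Assumption: the sum is normalised by the number of relevant documents
    found within the top k; other conventions exist.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank = relevant documents seen so far / rank.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[bool]], k: int) -> float:
    """Mean of average_precision_at_k over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision_at_k(run, k) for run in runs) / len(runs)


# Hypothetical relevance judgements for two queries (not data from the paper).
runs = [
    [True, False, True, False, False],
    [False, True, False, False, True],
]

for k in (5, 10, 15):
    print(f"MAP@{k} = {mean_average_precision(runs, k):.4f}")
```

To compare models in the way the abstract describes, one would run the same queries through each model (baseline, stemming, lemmatization, SL), collect one list of relevance judgements per query, and call `mean_average_precision` once per model and cutoff.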