Article information

  • Title: Textual Coherence Improvement of Extractive Document Summarization Using Greedy Approach and Word Vectors
  • Authors: Mohamad Abdolahi; Morteza Zahedi
  • Journal: International Journal of Modern Education and Computer Science
  • Print ISSN: 2075-0161
  • Online ISSN: 2075-017X
  • Publication year: 2019
  • Volume: 11
  • Issue: 4
  • Pages: 23-31
  • DOI: 10.5815/ijmecs.2019.04.03
  • Publisher: MECS Publisher
  • Abstract: Document summarization is receiving growing attention across NLP tasks. The main challenges for compact summaries remain full coverage of the source information, coherence of the output sentences, and avoidance of similar sentences (non-redundancy). Although some research has addressed compact summaries, few empirical investigations have examined the coherence of output sentences. This paper aims to develop a comprehensive and practical methodology for generating coherent summaries. The approach is a mixed method that uses the most likely n-grams and the word2vec algorithm to convert individual sentences into normalized numeric matrices, from which statistical properties are extracted. Using a greedy approach, the sentences most relevant to the main subject of the document are selected and placed in the output summary. The proposed greedy method is the backbone algorithm: it is iterative and maximizes two features of the summary, conceptual coherence and subject-matter diversity. The approach is compared with a similar model, Q_Network, and the authors report that it performs better on long text documents.
  • Keywords: Natural language processing; Extractive summarization; Text coherence; Word vector; Language models
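
The greedy selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring function, so the code below swaps in a maximal-marginal-relevance style score (relevance to the document centroid minus similarity to sentences already chosen). The toy two-dimensional vectors stand in for the word2vec-derived sentence representations, and the names `greedy_select` and `trade_off` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of the vectors, used here as a stand-in for the document subject."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def greedy_select(sentence_vectors, k, trade_off=0.7):
    """Greedily pick up to k sentence indices.

    Assumed scoring (not taken from the paper):
        trade_off * relevance - (1 - trade_off) * redundancy
    where relevance is similarity to the document centroid and redundancy
    is the highest similarity to any sentence already selected.
    """
    doc_vector = centroid(sentence_vectors)
    selected = []
    candidates = list(range(len(sentence_vectors)))

    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(sentence_vectors[i], doc_vector)
            redundancy = max(
                (cosine(sentence_vectors[i], sentence_vectors[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Return indices in document order, a simple assumption for readability.
    return sorted(selected)


# Toy sentence vectors (made up for illustration only).
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
summary_indices = greedy_select(vectors, k=2)
print(summary_indices)
```

Lowering `trade_off` makes the sketch favor diversity over relevance, which is one simple way to trade off the two features the abstract names.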