期刊名称:Conference on European Chapter of the Association for Computational Linguistics (EACL)
出版年度:2021
卷号:2021
页码:2727-2733
DOI:10.18653/v1/2021.eacl-main.235
语种:English
出版社:ACL Anthology
摘要:A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of generated summaries and we show that the entity hallucination problem can be alleviated by simply filtering the training data. In addition, we propose a summary-worthy entity classification task to the training process as well as a joint entity and summary generation approach, which yield further improvements in entity level metrics.