期刊名称:Conference on European Chapter of the Association for Computational Linguistics (EACL)
出版年度:2021
卷号:2021
页码:659-668
DOI:10.18653/v1/2021.eacl-main.54
语种:English
出版社:ACL Anthology
摘要:This paper studies the problem of generatinglikely queries for multimodal documents withimages. Our application scenario is enablingefficient “first-stage retrieval” of relevant doc-uments, by attaching generated queries to doc-uments before indexing. We can then indexthis expanded text to efficiently narrow downto candidate matches using inverted index, sothat expensive reranking can follow. Our eval-uation results show that our proposed multi-modal representation meaningfully improvesrelevance ranking.More importantly, ourframework can achieve the state of the art inthe first stage retrieval scenarios.