期刊名称:Conference on European Chapter of the Association for Computational Linguistics (EACL)
出版年度:2021
卷号:2021
页码:989-999
DOI:10.18653/v1/2021.eacl-main.85
语种:English
出版社:ACL Anthology
摘要:The Internet contains a multitude of social media posts and other of stories where text is interspersed with images. In these contexts, images are not simply used for general illustration, but are judiciously placed in certain spots of a story for multimodal descriptions and narration. In this work we analyze the problem of text-image alignment, and present SANDI, a methodology for automatically selecting images from an image collection and aligning them with text paragraphs of a story. SANDI combines visual tags, user-provided tags and background knowledge, and uses an Integer Linear Program to compute alignments that are semantically meaningful. Experiments show that SANDI can select and align images with texts with high quality of semantic fit.