首页    期刊浏览 2024年12月02日 星期一
登录注册

文章基本信息

  • 标题:A Cross-Modal Image and Text Retrieval Method Based on Efficient Feature Extraction and Interactive Learning CAE
  • 本地全文:下载
  • 作者:Xiuye Yin ; Liyong Chen
  • 期刊名称:Scientific Programming
  • 印刷版ISSN:1058-9244
  • 出版年度:2022
  • 卷号:2022
  • DOI:10.1155/2022/7314599
  • 语种:English
  • 出版社:Hindawi Publishing Corporation
  • 摘要:In view of the complexity of the multimodal environment and the existing shallow network structure that cannot achieve high-precision image and text retrieval, a cross-modal image and text retrieval method combining efficient feature extraction and interactive learning convolutional autoencoder (CAE) is proposed. First, the residual network convolution kernel is improved by incorporating two-dimensional principal component analysis (2DPCA) to extract image features and extracting text features through long short-term memory (LSTM) and word vectors to efficiently extract graphic features. Then, based on interactive learning CAE, cross-modal retrieval of images and text is realized. Among them, the image and text features are respectively input to the two input terminals of the dual-modal CAE, and the image-text relationship model is obtained through the interactive learning of the middle layer to realize the image-text retrieval. Finally, based on Flickr30K, MSCOCO, and Pascal VOC 2007 datasets, the proposed method is experimentally demonstrated. The results show that the proposed method can complete accurate image retrieval and text retrieval. Moreover, the mean average precision (MAP) has reached more than 0.3, the area of precision-recall rate (PR) curves are better than other comparison methods, and they are applicable.
国家哲学社会科学文献中心版权所有