Article information

  • Title: Video captioning in Vietnamese using deep learning
  • Authors: Dang Thi Phuc; Tran Quang Trieu; Nguyen Van Tinh
  • Journal: International Journal of Electrical and Computer Engineering
  • Electronic ISSN: 2088-8708
  • Publication year: 2022
  • Volume: 12
  • Issue: 3
  • Pages: 3092-3103
  • DOI: 10.11591/ijece.v12i3.pp3092-3103
  • Language: English
  • Publisher: Institute of Advanced Engineering and Science (IAES)
  • Abstract: Demand for applications that use digital cameras grows year by year. However, analyzing large amounts of video data remains one of the most challenging problems. Beyond storing the data captured by the camera, intelligent systems must analyze it quickly in order to address important situations. In this paper, we use deep learning techniques to build models that automatically describe movements in video. We use three deep learning models: a sequence-to-sequence model based on a recurrent neural network, a sequence-to-sequence model with attention, and a transformer model. We evaluate the effectiveness of the approaches by comparing the results of the three models. To train these models, we use the Microsoft Research Video Description Corpus (MSVD) dataset, which includes 1,970 videos and 85,550 captions translated into Vietnamese. To ensure that the content is described in Vietnamese, we also combine the models with a natural language processing (NLP) model for Vietnamese.
  • Keywords: attention; natural language processing; sequence-to-sequence model; transformer; video caption