
Article Information

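  • Title: Automatic Generation of Speech-Accompanying Gestures Using a Bi-Directional LSTM Network (original title: Bi-Directional LSTM Networkを用いた発話に伴うジェスチャの自動生成手法)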
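  • Authors: 金子 直史; 竹内 健太; 長谷川 大
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Online ISSN: 1346-8030
  • Year: 2019
  • Volume: 34
  • Issue: 6
  • Pages: 1-12
  • DOI: 10.1527/tjsai.C-J41
  • Publisher: The Japanese Society for Artificial Intelligence
  • Abstract: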

    We present a novel framework for automatically generating natural gesture motion from speech. The proposed method consists of two steps. First, a deep network based on a Bi-Directional LSTM learns speech-gesture relationships with both forward and backward consistency over long time spans. At each time step, the network regresses the full 3D skeletal pose of a human from perceptual features extracted from the input audio. Second, we apply combined temporal filters to smooth the generated pose sequences. To train the network, we use a speech-gesture dataset recorded with a headset and a marker-based motion capture system. We evaluate different acoustic features, network architectures, and temporal filters to validate the effectiveness of the proposed approach. We also conduct a subjective evaluation that compares our approach against real human gestures. The results show that, on the naturalness scale, our generated gestures are comparable to “original” human gestures and significantly better than “mismatched” human gestures taken from a different utterance.

  • Keywords: speech; gesture; deep learning; neural networks; long short-term memory
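
The second step described in the abstract, smoothing the generated pose sequence with combined temporal filters, can be illustrated with a minimal sketch. The abstract does not say which filters are combined, so the pairing below (a centered moving average followed by exponential smoothing) is an assumption made purely for illustration. The function names, the window size, and the `alpha` value are all hypothetical. Each frame is assumed to be a flat list of joint coordinates.

```python
from typing import List

# One frame: flattened joint coordinates (x, y, z per joint). Assumed layout.
Pose = List[float]


def moving_average(seq: List[Pose], window: int = 5) -> List[Pose]:
    """Centered moving average over time; the window shrinks at the edges."""
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2
    out: List[Pose] = []
    for t in range(len(seq)):
        lo, hi = max(0, t - half), min(len(seq), t + half + 1)
        n = hi - lo
        out.append([sum(seq[k][d] for k in range(lo, hi)) / n
                    for d in range(len(seq[t]))])
    return out


def exponential_smoothing(seq: List[Pose], alpha: float = 0.5) -> List[Pose]:
    """Causal low-pass filter: y[t] = alpha * x[t] + (1 - alpha) * y[t-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out: List[Pose] = []
    for frame in seq:
        if not out:
            out.append(list(frame))  # first frame passes through unchanged
        else:
            prev = out[-1]
            out.append([alpha * x + (1.0 - alpha) * p
                        for x, p in zip(frame, prev)])
    return out


def smooth_poses(seq: List[Pose], window: int = 5,
                 alpha: float = 0.5) -> List[Pose]:
    """Hypothetical 'combined' filter: moving average, then exponential smoothing."""
    return exponential_smoothing(moving_average(seq, window), alpha)
```

Applied to a per-frame pose sequence predicted by the network, `smooth_poses(predicted, window=5, alpha=0.5)` returns a sequence of the same length with reduced frame-to-frame jitter. Stronger smoothing also attenuates fast, deliberate motion, so the parameters would need tuning against the data.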