
Basic Article Information

  • Title: Speech Emotion Recognition Model Based on Attention CNN Bi-GRU Fusing Visual Information
  • Authors: Zhangfang Hu; Lan Wang; Yuan Luo
  • Journal: Engineering Letters
  • Print ISSN: 1816-093X
  • Online ISSN: 1816-0948
  • Publication year: 2022
  • Volume: 30
  • Issue: 2
  • Pages: 427-434
  • Language: English
  • Publisher: Newswood Ltd
  • Abstract: Interference such as data redundancy and irrelevant features readily lowers the recognition accuracy of emotion recognition models. In this paper, we propose a speech emotion recognition (SER) method based on an attentional convolutional neural network (CNN) and a bidirectional gated recurrent unit (Bi-GRU) that fuses visual information. First, a ResNet-based attentional convolutional neural network (RACNN) is pretrained on log-mel spectrograms to extract speech features. Second, the CNN-extracted static facial appearance features are fused with the speech features using a deep Bi-GRU to obtain speech appearance features. A series of gated recurrent units with attention mechanisms (AGRUs) is used to extract facial geometric features. Then, hybrid features are obtained by combining the integrated speech appearance features with the facial geometric features, and kernel linear discriminant analysis (KLDA) is used to discriminate them. Finally, the proposed method obtained accuracies of 87.92% and 89.65% on the RAVDESS and eNTERFACE'05 emotion databases, respectively. The experimental results demonstrate that the method effectively improves the accuracy and robustness of SER.
  • Keywords: SER; visual information; Bi-GRU; AGRUs; KLDA