Abstract: Static hand gesture recognition against simple or complex backgrounds has become mature, but dynamic hand gesture recognition against a simple background is still challenging. The reason is that dynamic hand gesture recognition must attend to temporal features as well as spatial features. In this paper, we study drivers' dynamic hand gesture recognition for automotive user interfaces, with the aim of ensuring safety and improving comfort. We use 3D convolutional neural networks (3DCNN), which can effectively extract temporal-spatial features, to classify dynamic hand gestures. Our contributions are threefold: 1) we extract key frames from the original video by computing the optical flow between adjacent frames; 2) we propose a new data augmentation method, data temporal cropping, which improves recognition accuracy and effectively prevents overfitting; 3) we propose a novel multi-direction convolutional neural network (mdCNN), which can extract discriminative temporal-spatial features from short videos. Finally, we present experimental results on the VIVA dataset and achieve a recognition accuracy of 65.35%, which is higher than that of other methods.
Keywords: Hand gesture recognition; Temporal-spatial features; Automotive user interfaces; 3D convolutional neural networks
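
The abstract names data temporal cropping as an augmentation but does not define it here, so the following is only an illustrative sketch of one plausible reading: take a random contiguous window of frames from a clip, then resample that window to a fixed length so every training sample has the same number of frames. The function name `temporal_crop`, its parameters, and the nearest-neighbour resampling step are assumptions made for illustration, not the authors' implementation.

```python
import random


def temporal_crop(frames, crop_len, out_len, rng=random):
    """Hypothetical temporal-cropping augmentation (an assumption, not the paper's code).

    Picks a random contiguous window of `crop_len` frames from `frames`,
    then resamples it to exactly `out_len` frames by nearest-neighbour
    index selection, preserving temporal order.
    """
    if not 1 <= crop_len <= len(frames):
        raise ValueError("crop_len must be between 1 and len(frames)")
    if out_len < 1:
        raise ValueError("out_len must be at least 1")

    # randint is inclusive on both ends, so every valid start is reachable.
    start = rng.randint(0, len(frames) - crop_len)
    window = frames[start:start + crop_len]

    if out_len == 1:
        return [window[0]]

    # Spread out_len indices evenly over the window, first and last included.
    step = (crop_len - 1) / (out_len - 1)
    return [window[round(i * step)] for i in range(out_len)]


# Usage: frames are stood in for by integers; real frames would be image arrays.
clip = temporal_crop(list(range(32)), crop_len=24, out_len=16, rng=random.Random(0))
```

Each call with a different random state yields a differently positioned window of the same gesture, which is how a scheme of this kind would enlarge a small training set.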