A Probabilistic Combination of CNN and RNN Estimates for Hand Gesture Based Interaction in Car

Aditya Tewari, Bertram Taetz, Frederic Grandidier, Didier Stricker
16th IEEE International Symposium on Mixed and Augmented Reality (ISMAR-17), October 9-13, Nantes, France

Abstract:
Hand gesture recognition is performed on top-view hand images captured by a Time-of-Flight (ToF) camera in a car. The work addresses two important problems of touchless interaction inside a car. The first is low-latency identification of gestures that are unobtrusive for the driver. The second is reducing the labelled data required to train learning-based solutions, which is particularly important because labelling gesture sequences is expensive and demanding. To improve the fast detection of hand gestures, the probability estimate of a Long Short-Term Memory (LSTM) network is corrected by the pose prediction of a Convolutional Neural Network (CNN). Weak models for hand gesture classes, based on five hand poses, are designed to assist this prediction-correction scheme. A training procedure that reduces the labelled data required for hand pose classification is also introduced. This method uses the statistical properties of the dataset to identify a good initialization of the CNN weights; here we demonstrate it using the Principal Component Analysis (PCA) embedding of unlabelled hand pose sequences. On a nine-class hand gesture problem we demonstrate an accuracy of 89.50% and show that this performs better than existing systems. We also show that a PCA-embedding-based initialization improves the classification performance of the CNN-based pose classifier.
Keywords:
Computer Vision; Hand Gestures; Activity recognition and understanding; Neural nets: LSTM, CNN
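
Illustrative sketch:
The prediction-correction idea described in the abstract can be illustrated with a small example. The abstract does not give the actual combination rule, so the code below is only one plausible reading: the LSTM output is treated as a prior over gestures, and the CNN pose estimate re-weights it through a weak model of which poses each gesture tends to contain. The function name `correct_gesture_posterior`, the matrix `pose_given_gesture`, and the Bayes-style product rule are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def correct_gesture_posterior(lstm_probs, cnn_pose_probs, pose_given_gesture):
    """Re-weight an LSTM gesture estimate with a CNN pose estimate.

    lstm_probs:         shape (G,), LSTM probability for each gesture class.
    cnn_pose_probs:     shape (P,), CNN probability for each hand pose.
    pose_given_gesture: shape (G, P), hypothetical weak model; row g is an
                        assumed distribution P(pose | gesture g).
    Returns a normalised probability vector of shape (G,).
    """
    lstm_probs = np.asarray(lstm_probs, dtype=float)
    cnn_pose_probs = np.asarray(cnn_pose_probs, dtype=float)
    pose_given_gesture = np.asarray(pose_given_gesture, dtype=float)

    # How well each gesture explains the CNN's pose estimate:
    # sum over poses of P(pose | gesture) * P_cnn(pose).
    likelihood = pose_given_gesture @ cnn_pose_probs

    # Bayes-style correction (assumed rule): prior times likelihood.
    unnormalised = lstm_probs * likelihood
    total = unnormalised.sum()
    if total <= 0.0:
        # The weak model rules out every gesture; keep the LSTM estimate.
        return lstm_probs
    return unnormalised / total


# Toy usage with the sizes named in the abstract: nine gestures, five poses.
# The weak model here is random and purely a placeholder.
rng = np.random.default_rng(0)
weak_model = rng.dirichlet(np.ones(5), size=9)   # each row sums to 1
lstm_estimate = np.full(9, 1.0 / 9.0)            # undecided LSTM
cnn_estimate = np.array([0.7, 0.1, 0.1, 0.05, 0.05])
corrected = correct_gesture_posterior(lstm_estimate, cnn_estimate, weak_model)
```

In this sketch an undecided LSTM estimate is sharpened as soon as the CNN sees a pose that only some gestures are likely to contain, which matches the low-latency motivation stated in the abstract.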