Two Phase Classification for Early Hand Gesture Recognition in 3D Top View Data

Aditya Tewari, Bertram Taetz, Frédéric Grandidier, Didier Stricker
International Symposium on Visual Computing (ISVC-16), Advances in Visual Computing, December 12-14, Las Vegas, Nevada, USA

Abstract:
This work classifies top-view hand gestures observed by a Time of Flight (ToF) camera using a Long Short-Term Memory (LSTM) neural network architecture. We demonstrate a performance improvement through two-phase classification: we reduce the number of classes to be separated in each phase and combine the output probabilities of the two phases. The modified system architecture achieves an average cross-validation accuracy of 90.75% on a 9-gesture dataset, which is shown to be an improvement over the single all-class LSTM approach. The networks are trained to predict the class label continuously during the sequence. We introduce a frame-based gesture prediction that uses the gesture probabilities accumulated per frame of the video sequence. This removes the latency incurred when the gesture is predicted only at the end of the sequence, as is usually the case with majority-voting methods.
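
The two ideas in the abstract, combining the output probabilities of two phases and predicting from probabilities accumulated per frame, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the first phase outputs a probability per group of gestures, that the second phase outputs a probability per gesture within each group, that the two are combined by multiplication, and that accumulation is a running sum. The function names, the grouping, and the numbers are hypothetical.

```python
def combine_two_phase(group_probs, within_group_probs, groups):
    """Combine phase-1 group probabilities with phase-2 in-group probabilities.

    Assumption (not stated in the abstract): the combined score of a class is
    P(group) * P(class | group).
    groups[g] lists the class indices that belong to group g.
    """
    n_classes = sum(len(members) for members in groups)
    combined = [0.0] * n_classes
    for g, members in enumerate(groups):
        for k, cls in enumerate(members):
            combined[cls] = group_probs[g] * within_group_probs[g][k]
    return combined


def running_predictions(frame_probs):
    """Return the predicted class after every frame.

    Per-frame class probabilities are accumulated as a running sum, so a
    prediction is available at each frame instead of only at sequence end.
    """
    totals = [0.0] * len(frame_probs[0])
    predictions = []
    for probs in frame_probs:
        totals = [t + p for t, p in zip(totals, probs)]
        predictions.append(max(range(len(totals)), key=totals.__getitem__))
    return predictions


# Hypothetical example: 4 gesture classes split into 2 groups, 2 frames.
groups = [[0, 1], [2, 3]]
frames = [
    combine_two_phase([0.6, 0.4], [[0.5, 0.5], [0.9, 0.1]], groups),
    combine_two_phase([0.8, 0.2], [[0.9, 0.1], [0.5, 0.5]], groups),
]
predictions = running_predictions(frames)  # [2, 0]
```

In this toy sequence the prediction changes from class 2 to class 0 once the second frame's evidence is added, which shows how a label is available at every frame rather than only after a vote over the complete sequence.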