Real-Time Energy Efficient Hand Pose Estimation: A Case Study

Mhd Rashed Al Koutayni, Vladimir Rybalkin, Muhammad Jameel Nawaz Malik, Ahmed Elhayek, Christian Weis, Gerd Reis, Norbert Wehn, Didier Stricker
Sensors - Open Access Journal (sensors) 20, pages 1-27, MDPI, 5/2020.

Abstract:
The estimation of human hand pose has become the basis for many vital applications in which the user relies mainly on hand pose as system input. Virtual reality (VR) headsets, the Shadow Dexterous Hand, and in-air signature verification are a few examples of applications that require tracking hand movements in real time. State-of-the-art 3D hand pose estimation methods are based on Convolutional Neural Networks (CNNs). These methods are implemented on Graphics Processing Units (GPUs), mainly because of their extensive computational requirements. However, GPUs are not suitable for practical application scenarios where low power consumption is crucial. Furthermore, the difficulty of embedding a bulky GPU into a small device limits the portability of such applications to mobile devices. The goal of this work is to provide an energy-efficient solution for an existing depth-camera-based hand pose estimation algorithm. First, we compress the deep neural network model by applying dynamic quantization techniques to different layers to achieve maximum compression without compromising accuracy. Afterwards, we design a custom hardware architecture. We selected the FPGA as the target platform because FPGAs provide high energy efficiency and can be integrated into portable devices. Our solution, implemented on a Xilinx UltraScale+ MPSoC FPGA, is 4.2x faster and 577.3x more energy efficient than the original implementation of the hand pose estimation algorithm on an NVIDIA GeForce GTX 1070.
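
The abstract does not spell out the quantization scheme, so the following is only a minimal sketch of one common approach, per-layer dynamic fixed-point quantization. It is not the authors' implementation. The function names, the layer names, and the bit widths are all hypothetical and chosen for illustration. Each layer gets its own number of fractional bits, derived from the largest weight magnitude in that layer, so layers with small weights keep more precision.

```python
import math


def choose_frac_bits(values, total_bits):
    """Pick the fractional bit count so the largest magnitude fits.

    One bit is reserved for the sign. The result may exceed
    total_bits - 1 when all values are much smaller than 1.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Bits needed left of the binary point (negative for small values).
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, total_bits):
    """Map floats to signed integer codes with a per-tensor scale."""
    frac_bits = choose_frac_bits(values, total_bits)
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    # Round to the nearest code, then clamp to the representable range.
    codes = [min(q_max, max(q_min, round(v * scale))) for v in values]
    return codes, frac_bits


def dequantize(codes, frac_bits):
    """Recover approximate float values from integer codes."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


# Hypothetical layers with assumed bit widths (not taken from the paper).
layers = {
    "conv1": ([0.5, -0.25, 0.1], 8),
    "fc1": ([0.02, -0.01, 0.005], 4),
}

for name, (weights, bits) in layers.items():
    codes, frac_bits = quantize(weights, bits)
    approx = dequantize(codes, frac_bits)
    print(name, bits, frac_bits, codes, approx)
```

In this sketch the bit width is a per-layer parameter. In practice it would be tuned layer by layer against a validation set, lowering it until accuracy starts to degrade.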