Real-Time Head Pose Estimation Using Multi-Variate RVM on Faces in the Wild

Mohamed Selim, Alain Pagani, Didier Stricker
International Conference on Computer Analysis of Images and Patterns (CAIP-2015), 16th, September 2-4, Valletta, Malta

Abstract:
Various computer vision problems and applications rely on an accurate, fast head pose estimator. We model head pose estimation as a regression problem. We show that the appearance of the facial image can be used as a feature that captures pose variations. We use a parametrized Multi-Variate Relevance Vector Machine (MVRVM) to learn the three rotation angles of the face (yaw, pitch, and roll). The input to the MVRVM is the normalized mean pixel intensities of the face patches, and the output is the three head rotation angles. We evaluated our approach on the challenging YouTube Faces dataset. We achieved head pose estimation with an average error tolerance of 6.5 degrees in the yaw angle and less than 2.5 degrees in both the pitch and roll angles. A single prediction takes 2-3 milliseconds, which makes the approach suitable for real-time applications.
Keywords:
Head Pose Estimation, Real-time, MVRVM, Faces in the Wild
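
The feature described in the abstract (normalized mean pixel intensities of face patches) can be illustrated with a minimal sketch. The function name `patch_mean_features`, the regular grid layout, the grid size, and the zero-mean/unit-variance normalization are all assumptions made for illustration; the abstract does not specify how patches are laid out or how the intensities are normalized. The MVRVM regressor itself is not shown.

```python
from typing import List, Sequence


def patch_mean_features(gray: Sequence[Sequence[float]], grid: int = 4) -> List[float]:
    """Return normalized mean intensities of a grid x grid tiling of a face crop.

    `gray` is a 2-D grayscale face image given as rows of pixel values.
    The grid size and the normalization scheme are illustrative assumptions,
    not values taken from the paper.
    """
    height, width = len(gray), len(gray[0])
    if height < grid or width < grid:
        raise ValueError("image is smaller than the patch grid")

    means = []
    for gy in range(grid):
        # Integer patch bounds that tile the image without gaps or overlap.
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            total = sum(sum(row[x0:x1]) for row in gray[y0:y1])
            means.append(total / ((y1 - y0) * (x1 - x0)))

    # Assumed normalization: zero mean and unit variance across patches,
    # which reduces sensitivity to global brightness and contrast.
    mu = sum(means) / len(means)
    std = (sum((m - mu) ** 2 for m in means) / len(means)) ** 0.5
    if std == 0.0:
        return [0.0] * len(means)
    return [(m - mu) / std for m in means]


# Toy 4x4 "face crop" split into a 2x2 grid of patches.
toy_face = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [20, 20, 30, 30],
    [20, 20, 30, 30],
]
features = patch_mean_features(toy_face, grid=2)
```

In a full pipeline, a vector like `features` would be passed to a trained multi-output regressor that returns the yaw, pitch, and roll angles.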

George Azzopardi, Nicolai Petkov (Eds.)
Computer Analysis of Images and Patterns. International Conference on Computer Analysis of Images and Patterns (CAIP-2015), 16th, September 2-4, Valletta, Malta
