Principles of Object Tracking and Mapping

Jason Raphael Rambach, Alain Pagani, Didier Stricker
In: Andrew Yeh Ching Nee; Soh Khim Ong. Springer Handbook of Augmented Reality. Pages 53-84, Springer Handbooks, ISBN 978-3-030-67821-0, Springer, Switzerland, 1/2023.

Abstract:
Tracking is the main enabling technology for Augmented Reality (AR), as it allows realistic placement of virtual content in the real world. In this chapter, we discuss the most important aspects of tracking for AR while reviewing existing systems that shaped the field over the past years. Initially, we provide a notation for the description of 6 Degrees of Freedom (6DoF) poses and camera models. Subsequently, we describe fundamental computer vision techniques that tracking systems frequently use, such as feature matching and tracking or pose estimation. We divide the description of tracking approaches into model-based approaches and Simultaneous Localization and Mapping (SLAM) approaches. Model-based approaches use a synthetic representation of an object as a template in order to match the real object. This matching can use texture or lines as tracking features to establish correspondences from the model to the image; machine learning approaches for direct pose estimation of an object from an input image have also been introduced recently. An upcoming challenge is the extension of tracking systems for AR from rigid objects to articulated and nonrigid objects. SLAM tracking systems do not require a reference model, as they simultaneously track and map their environment. We discuss keypoint-based, direct, and semi-direct purely visual SLAM approaches. Next, we analyze how additional sensing can support tracking, for example through visual-inertial sensor fusion or depth sensing. Finally, we look at the use of machine learning techniques, especially deep neural networks, in conjunction with traditional computer vision approaches for SLAM.