Paper accepted at MDPI Electronics

Our paper “Controlling Teleportation-Based Locomotion in Virtual Reality with Hand Gestures: A Comparative Evaluation of Two-Handed and One-Handed Techniques” got accepted at MDPI Electronics for a Special Issue on Recent Advances in Virtual Reality and Augmented Reality.

Paper: (available as Open Access)
Authors: Alexander SchäferGerd ReisDidier Stricker

Abstract: Virtual Reality (VR) technology offers users the possibility to immerse and freely navigate through virtual worlds. An important component for achieving a high degree of immersion in VR is locomotion. Often discussed in the literature, a natural and effective way of controlling locomotion is still a general problem which needs to be solved. Recently, VR headset manufacturers have been integrating more sensors, allowing hand or eye tracking without any additional required equipment. This enables a wide range of application scenarios with natural freehand interaction techniques where no additional hardware is required. This paper focuses on techniques to control teleportation-based locomotion with hand gestures, where users are able to move around in VR using their hands only. With the help of a comprehensive study involving 21 participants, four different techniques are evaluated. The effectiveness and efficiency as well as user preferences of the presented techniques are determined. Two two-handed and two one-handed techniques are evaluated, revealing that it is possible to move comfortable and effectively through virtual worlds with a single hand only.

TiCAM Dataset for in-Cabin Monitoring released

As part of the research activities of DFKI Augmented Vision in the VIZTA project (, we have published the open-source dataset for automotive in-cabin monitoring with a wide-angle time-of-flight depth sensor. The TiCAM dataset represents a variety of in-car person behavior scenarios and is annotated with 2D/3D bounding boxes, segmentation masks and person activity labels. The dataset is available here The publication describing the dataset in detail is available as a preprint here:

Contacts: Jason Rambach, Jigyasa Katrolia

Paper accepted at ICRA 2021

We are delighted to announce that our paper PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN has been accepted for publication at the ICRA 2021 IEEE International Conference on Robotics and Automation which will take place from May 30 to June 5, 2021 at Xi’an, China.

Abstract: Instance segmentation of planar regions in indoor scenes benefits  visual  SLAM  and  other  applications  such  as augmented reality (AR) where scene understanding is required. Existing  methods  built  upon  two-stage  frameworks  show  satisfactory  accuracy  but  are  limited  by  low  frame  rates.  In this  work,  we  propose  a  real-time  deep  neural  architecture that  estimates  piece-wise  planar  regions  from  a  single  RGB image. Our model employs a variant of a fast single-stage CNN architecture to segment plane instances.  Considering  the  particularity of the target detected, we propose Fast Feature Non-maximum  Suppression  (FF-NMS)  to  reduce  the  suppression errors  resulted  from  overlapping  bounding  boxes  of  planes. We  also  utilize  a  Residual  Feature  Augmentation  module  in the  Feature  Pyramid  Network  (FPN)  .  Our  method  achieves significantly  higher  frame-rates  and  comparable  segmentation accuracy  against  two-stage  methods.  We automatically label over 70,000 images as ground truth from the Stanford 2D-3D-Semantics dataset. Moreover, we incorporate our method with a state-of-the-art planar SLAM and validate  its  benefits.

Authors: Yaxu Xie, Jason Raphael Rambach, Fangwen Shu, Didier Stricker



Two articles published at IEEE Access journal

We are happy to announce that two of our papers have been accepted and published in the IEEE Access journal. IEEE Access is an award-winning, multidisciplinary, all-electronic archival journal, continuously presenting the results of original research or development across all of IEEE’s fields of interest. The articles are published with open access to all readers. The research is part of the BIONIC project and was funded by the European Commission under the Horizon 2020 Programme Grant Agreement n. 826304.

“Simultaneous End User Calibration of Multiple Magnetic Inertial Measurement Units With Associated Uncertainty”
Published in: IEEE Access (Volume: 9)
Page(s): 26468 – 26483
Date of Publication: 05 February 2021
Electronic ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3057579

“Magnetometer Robust Deep Human Pose Regression With Uncertainty Prediction Using Sparse Body Worn Magnetic Inertial Measurement Units”
Published in: IEEE Access (Volume: 9)
Page(s): 36657 – 36673
Date of Publication: 26 February 2021
Electronic ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3062545

Presentation on Machine Learning and Computer Vision by Dr. Jason Rambach

On March 4th, 2021, Dr. Jason Rambach gave a talk on Machine Learning and Computer Vision at the GIZ (Deutsche Gesellschaft für Internationale Zusammenarbeit) workshop on Machine Learning and Computer Vision for Earth Observation organized by the DFKI MLT department. In the talk, the foundations of Computer Vision, Machine Learning and Deep Learning as well as current Research and Implementation challenges were presented.

Presentation by our senior researcher Dr. Jason Rambach
Agenda of the GIZ workshop on Machine Learning and Computer Vision for Earth Observation

VIZTA project: 18-month public project summary released

DFKI participates in the VIZTA project, coordinated by ST Micrelectronics, aiming  at developing innovative technologies in the field of optical sensors and laser sources for short to long-range 3D-imaging and to demonstrate their value in several key applications including automotive, security, smart buildings, mobile robotics for smart cities, and industry4.0. The 18-month public summary of the project was released, including updates from DFKI Augmented Vision on time-of-flight camera dataset recording and deep learning algorithm development for car in-cabin monitoring and smart building person counting and anomaly detection applications.

Please click here to check out the complete summary.

3 Papers accepted at VISAPP 2021

We are excited to announce that the Augmented Vision group will present 3 papers in the upcoming VISAPP 2021 Conference, February 8th-10th, 2021:

The International Conference on Computer Vision Theory and Applications (VISAPP) is part of VISIGRAPP, the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. VISAPP aims at becoming a major point of contact between researchers, engineers and practitioners on the area of computer vision application systems. Homepage:

The 3 accepted papers are:

1.  An Adversarial Training based Framework for Depth Domain Adaptation
Jigyasa Singh Katrolia, Lars Krämer, Jason Raphael Rambach, Bruno Mirbach, Didier Stricker
One sentence summary: The paper presents a GAN-based method for domain adaptation between depth images.

2. OFFSED: Off-Road Semantic Segmentation Dataset
Peter Neigel, Jason Raphael Rambach, Didier Stricker
One sentence summary: A dataset for semantic segmentation in off-road scenes for automotive applications is made publically available.

3. SALT: A Semi-automatic Labeling Tool for RGB-D Video Sequences
Dennis Stumpf, Stephan Krauß, Gerd Reis, Oliver Wasenmüller, Didier Stricker
One sentence summary: SALT proposes a simple and effective tool to facilitate the annotation process for segmentation and detection ground truth data in RGB-D video sequences.

Article at MDPI Sensors journal

We are happy to announce that our paper “SynPo-Net–Accurate and Fast CNN-Based 6DoF Object Pose Estimation Using Synthetic Training” has been accepted for publication at the MDPI Sensors journal, Special Issue Object Tracking and Motion Analysis. Sensors (ISSN 1424-8220; CODEN: SENSC9) is the leading international peer-reviewed open access journal on the science and technology of sensors.

Abstract: Estimation and tracking of 6DoF poses of objects in images is a challenging problem of great importance for robotic interaction and augmented reality. Recent approaches applying deep neural networks for pose estimation have shown encouraging results. However, most of them rely on training with real images of objects with severe limitations concerning ground truth pose acquisition, full coverage of possible poses, and training dataset scaling and generalization capability. This paper presents a novel approach using a Convolutional Neural Network (CNN) trained exclusively on single-channel Synthetic images of objects to regress 6DoF object Poses directly (SynPo-Net). The proposed SynPo-Net is a network architecture specifically designed for pose regression and a proposed domain adaptation scheme transforming real and synthetic images into an intermediate domain that is better fit for establishing correspondences. The extensive evaluation shows that our approach significantly outperforms the state-of-the-art using synthetic training in terms of both accuracy and speed. Our system can be used to estimate the 6DoF pose from a single frame, or be integrated into a tracking system to provide the initial pose.

Authors: Yongzhi Su, Jason Raphael Rambach, Alain Pagani, Didier Stricker



Final virtual training workshop for the Erasmus+ project ArInfuse: Exploiting the potential of Augmented Reality & Geospatial Technologies within the utilities sector

After two years of collaborative work, the project ArInfuse is inviting for its final workshop on January 28th.

ARinfuse is an Erasmus+ project that aims to infuse skills in Augmented Reality for geospatial information management in the context of utility underground infrastructures, such as water, sewage, electricity, gas and fiber optics. In this field, there is a real need for an accurate positioning of the underground utilities, to avoid damages to the existing infrastructures. Information communication technologies (ICT), in fusion with global navigation satellite systems (GNSS), GIS and geodatabases and augmented/virtual reality (AR/VR) are able to offer the possibility to convert the geospatial information of the underground utilities into a powerful tool for field workers, engineers and managers.
ARinfuse is mainly addressed to technical professional profiles (future and current) in the utility sector that use, or are planning to use AR technology into practical applications of ordinary management and maintenance of utility networks.

The workshop entitled “Exploiting the potential of Augmented Reality & Geospatial Technologies within the utilities sector” is addressed to engineering students and professionals that are interested in the function, appliance and benefits of AR and geospatial technologies in the utilities sector.

The workshop will also introduce the ARinfuse catalogue of training modules on Augmented Reality and Geoinformatics applied within the utility infrastructure sector.

More information:

Contact persons: Dr. Alain Pagani and Narek Minaskan

Three papers accepted at ICPR 2020

We are proud to announce that the Augmented Vision group will present three papers in the upcoming ICPR 2020 conference which will take place from January 10th till 15th, 2021. The International Conference on Pattern Recognition (ICPR) is the premier world conference in Pattern Recognition. It covers both theoretical issues and applications of the discipline. The 25th event in this series is organized as an online virtual conference with more than 1800 participants expected.

The three accepted papers are:

1.  HPERL: 3D Human Pose Estimation from RGB and LiDAR
David Michael Fürst, Shriya T. P. Gupta, René Schuster, Oliver Wasenmüller, Didier Stricker
One sentence summary: HPERL proposes a two-stage 3D human pose detector that fuses RGB and LiDAR information for a precise localization in 3D.
Presentation date: PS T3.3, January 12th, 5 pm CET. 

2. ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid Networks for Accurate Dense Pixel Matching
Rishav, René Schuster, Ramy Battrawy, Oliver Wasenmüller, Didier Stricker
One sentence summary: ResFPN extends Feature Pyramid Networks by adding residual connections from higher resolution features maps to obtain stronger and better localized features for dense matching with deep neural networks.
This paper is accepted as an oral presentation (best 6% of all submissions).
Presentation date: OS T5.1, January 12th, 2 pm CET; PS T5.1, January 12th, 5 pm CET.

3. Ghost Target Detection in 3D Radar Data using Point Cloud based Deep Neural Network
Mahdi Chamseddine, Jason Rambach, Oliver Wasenmüller, Didier Stricker
One sentence summary: An extension to PointNet is developed and trained to detect ghost targets in 3D radar point clouds using labels by an automatic labelling algorithm.
Presentation date: PS T1.16, January 15th, 4:30 pm CET.