News Archive

News

Start des Projektes „DECODE”

KI zur Erkennung menschlicher Bewegungen und des Umfeldes

Adaptive Methoden die kontinuierlich dazu lernen (Lebenslanges Lernen), bilden eine zentrale Herausforderung zur Entwicklung von robusten, realitätsnahen KI-Anwendungen. Neben der reichen Historie auf dem Gebiet des allgemeinen kontinuierlichen Lernens („Continual Learning“) hat auch das Themenfeld von kontinuierlichem Lernen für Machinelles Sehen unter Realbedingungen jüngst an Interesse gewonnen.

Ziel des Projektes DECODE ist die Erforschung von kontinuierlich adaptierfähigen Modellen zur Rekonstruktion und dem Verständnis von menschlicher Bewegung und des Umfeldes in anwendungsbezogenen Umgebungen. Dazu sollen mobile, visuelle und inertiale Sensoren (Beschleunigungs- und Drehratensensoren) verwendet werden. Für diese verschiedenen Typen an Sensoren und Daten sollen unterschiedliche Ansätze aus dem Bereich des Continual Learnings erforscht und entwickelt werden um einen problemlosen Transfer von Laborbedingungen zu alltäglichen, realistischen Szenarien zu gewährleisten. Dabei konzentrieren sich die Arbeiten auf die Verbesserung in den Bereichen der semantischen Segmentierung von Bildern und Videos, der Schätzung von Kinematik und Pose des menschlichen Körpers sowie der Repräsentation von Bewegungen und deren Kontext. Das Feld potentieller Anwendungsgebiete für die in DECODE entwickelten Methoden ist weitreichend und umfasst eine detaillierte ergonomische Analyse von Mensch-Maschine Interaktionen zum Beispiel am Arbeitsplatz, in Fabriken, oder in Fahrzeugen.

Weitere Informationen: https://www.dfki.de/web/forschung/projekte-publikationen/projekte-uebersicht/projekt/decode

Contact: René Schuster

Tewodros Amberbir Habtegebrial honored with Google PhD Fellowship

Mr. Habtegebrial is a PhD student at the Augmented Vision research department at the German Research Center for Artificial Intelligence (DFKI) and at the same named lab at the Technical University of Kaiserslautern (TUK). He was awarded the Google PhD Fellowship for his exceptional and innovative research in the field of “Machine Perception“. The PhD fellowship is endowed with 80,000 US dollars. Google also provides each of the PhD students with a research mentor.

Professor Didier Stricker, Tewodros’ PhD supervisor and head of the respective research areas at TUK and DFKI on the award for his PhD student: “I am very pleased that Tewodros received a PhD Fellowship from Google. He earned the honor through his outstanding achievements in his research work in Machine Perception and Image Synthesis.”

As part of his PhD studies Mr Habtegebrial has been working on Image-Based Rendering (IBR). Recently, he has worked on a technique that enables Neural Networks to render realistic novel views, given a single 2D semantic map of the scene. The approach has been published together with google and Nvidia at the pemium conference Neurips 2020. In collaboration with researchers at DFKI and Google research, he is working on spherical light-field interpolation and realistic modelling of reflective surfaces in IBR. This enables the implementation of new applications in the field of realistic virtual reality (VR) and telepresence. In addition to his PhD, topic he has co-authored several articles on Optical Character Recognition (OCR) for Amharic language, which is the official language of Ethiopia.

Further information:
https://research.google/outreach/phd-fellowship/recipients/
https://www.dfki.de/en/web/news/google-phd-fellowship
https://www.uni-kl.de/pr-marketing/news/news/tewodros-amberbir-habtegebrial-mit-google-phd-fellowship-ausgezeichnet

Hitachi presents research with the German Research Center for Artificial Intelligence (DFKI)

Hitachi and DFKI have been collaborating on various research projects for many years. Hitachi is now presenting joint current research with DFKI, the AG wearHEALTH at the Technical University of Kaiserslautern (TUK), Xenoma Inc. and sci-track GmbH, a joint spin-off of DFKI and TUK, in the field of occupational safety in a video.

© Hitachi

The partners have jointly developed wearable AI technology that supports the monitoring of workers’ physical workload, the capturing of workflows and can be used to optimize them in terms of efficiency, occupational safety and health. Sensors are loosely integrated into normal working clothes to measure the pose and movements of the body segments. A new approach to handle cloth induced artefakts allows full wearing comfort and high capturing accuracy and reliability.  
 
Hitachi and DFKI will use the new solution to support worker and prevent dangerous poses to create a more efficient and safe working environment, while supporting full wearing comfort of any clothes.
 
Hitachi is a Principal Partner of the 2021 UN Climate Change Conference, known internationally as COP26, where it will present a video of its joint collaboration with DFKI, among other projects.
 
Further information:
Solution to visualize workers’ loads – Hitachi – YouTube
https://www.dfki.de/en/web/news/hitachi

Contact:  Prof. Dr. Didier Stricker

Hitachi, Ltd. (TSE: 6501), headquartered in Tokyo, Japan, is contributed to a sustainable society with a higher quality of life by driving innovation through data and technology as the Social Innovation Business. Hitachi is focused on strengthening its contribution to the Environment, the Resilience of business and social infrastructure as well as comprehensive programs to enhance Security & Safety. Hitachi resolves the issues faced by customers and society across six domains: IT, Energy, Mobility, Industry, Smart Life and Automotive Systems through its proprietary Lumada solutions. The company’s consolidated revenues for fiscal year 2020 (ended March 31, 2021) totaled 8,729.1 billion yen ($78.6 billion), with 871 consolidated subsidiaries and approximately 350,000 employees worldwide. Hitachi is a Principal Partner of COP26, playing a leading role in the efforts to achieve a Net Zero society and become a climate change innovator. Hitachi strives to achieve carbon neutrality at all its business sites by fiscal year 2030 and across the company’s entire value chain by fiscal year 2050. For more information on Hitachi, please visit the company’s website at https://www.hitachi.com.

Medica 2021: Better posture at the workplace thanks to new sensor technology

Whether pain in the back, shoulders or knees: Incorrect posture in the workplace can have consequences. A sensor system developed by researchers at the German Research Centre for Artificial Intelligence (DFKI) and TU Kaiserslautern might be of help. Sensors on the arms, legs and back, for example, detect movement sequences and software evaluates the data obtained. The system provides the user with direct feedback via a Smartwatch so that he can correct movement or posture. The sensors could be installed in working clothes and shoes. The researchers have presented this technology at the medical technology trade fair Medica held from November 15th to 18th, 2021 at the Rhineland-Palatinate research stand (hall 3, stand E80).

Assembling components in a bent posture, regularly putting away heavy crates on shelves or quickly writing an e-mail to a colleague on the computer – during work most people do not pay attention to an ergonomically sensible posture or a gentle sequence of movements. This can result in back pain that may well occur several times a month or week and develop into chronic pain over time. However, incorrect posture can also lead to permanent pain in the hips, neck or knees.

A technology currently being developed by a research team at DFKI and Technische Universität Kaiserslautern (TUK) can provide a remedy in the future. Sensors are used that are simply attached to different parts of the body such as arms, spine and legs. “Among other things, they measure accelerations and so-called angular velocities. The data obtained is then processed by our software,” says Markus Miezal from the wearHEALTH working group at TUK. On this basis, the software calculates motion parameters such as joint angles at arm and knee or the degree of flexion or twisting of the spine. “The technology immediately recognizes if a movement is performed incorrectly or if an incorrect posture is adopted,” continues his colleague Mathias Musahl from the Augmented Vision/Extended Reality research unit at the DFKI.

The Smartwatch is designed to inform the user directly in order to correct his movement or posture. Among other things, the researchers plan to install the sensors in work clothing and shoes. This technology is interesting, for example, for companies in industry, but it can also help to pay more attention to one’s own body in everyday office life at a desk.

All of this is part of the BIONIC project, which is funded by the European Union. BIONIC stands for “Personalized Body Sensor Networks with Built-In Intelligence for Real-Time Risk Assessment and Coaching of Ageing workers, in all types of working and living environments”. It is coordinated by Professor Didier Stricker, head of the Augmented Vision/Extended Reality research area at DFKI. The aim is to develop a sensor system with which incorrect posture and other stresses at the workplace can be reduced.

In addition to the DFKI and the TUK, the following are involved in the project: the Federal Institute for Occupational Safety and Health (BAuA) in Dortmund, the Spanish Instituto de Biomechanica de Valencia, the Fundación Laboral de la Construcción, also in Spain, the Roessingh Research and Development Centre at the University of Twente in the Netherlands, the Systems Security Lab at the Greek University of Piraeus, Interactive Wear GmbH in Munich, Hypercliq IKE in Greece, ACCIONA Construcción S.A. in Spain and Rolls-Royce Power Systems AG in Friedrichshafen.

Further information:
Website BIONIC
Video

Contact: Markus Miezal, Dipl.-Ing. Mathias Musahl

Related news: 02/26/2019 Launch of new EU project –“BIONIC,” an intelligent sensor network designed to reduce the physical demands at the workplace

DFKI AV – Stellantis Collaboration on Radar-Camera Fusion – 2 publications

DFKI Augmented Vision is working with Stellantis on the topic of Radar-Camera Fusion for Automotive Object Detection using Deep Learning since 2020. The collaboration has already led to two publications, in ICCV 2021 (International Conference on Computer Vision – ERCVAD Workshop on “Embedded and Real-World Computer Vision in Autonomous Driving”) and WACV 2022 (Winter Conference on Applications of Computer Vision).

The 2 publications are:

1.  Deployment of Deep Neural Networks for Object Detection on Edge AI Devices with Runtime OptimizationProceedings of the IEEE International Conference on Computer Vision Workshops – ERCVAD Workshop on Embedded and Real-World Computer Vision in Autonomous Driving

Lukas Stefan Stäcker, Juncong Fei, Philipp Heidenreich, Frank Bonarens, Jason Rambach, Didier Stricker, Christoph Stiller

This paper discusses the optimization of neural network based algorithms for object detection based on camera, radar, or lidar data in order to deploy them on an embedded system on a vehicle.

2. Fusion Point Pruning for Optimized 2D Object Detection with Radar-Camera FusionProceedings of the IEEE Winter Conference on Applications of Computer Vision, 2022

Lukas Stefan Stäcker, Juncong Fei, Philipp Heidenreich, Frank Bonarens, Jason Rambach, Didier Stricker, Christoph Stiller

This paper introduces fusion point pruning, a new method to optimize the selection of fusion points within the deep learning network architecture for radar-camera fusion.

Please view the abstract here: Fusion Point Pruning for Optimized 2D Object Detection with Radar-Camera Fusion (dfki.de)

Contact: Dr. Jason Rambach

GreifbAR Projekt – Greifbare Realität – Interaktion mit realen Werkzeugen in Mixed-Reality Welten

Am 01.10.2021 ist das Forschungsprojekt Projekt GreifbAR gestartet unter Leitung des DFKI (Forschungsbereich Erweiterte Realität). Ziel des Projekts GreifbAR ist es, Mixed-Reality Welten (MR), einschließlich virtueller (VR) und erweiterter Realität („Augmented Reality“ – AR), greifbar und fassbar zu machen, indem die Nutzer mit bloßen Händen mit realen und virtuellen Objekten interagieren können. Die Genauigkeit und Geschicklichkeit der Hand ist für die Ausführung präziser Aufgaben in vielen Bereichen von größter Bedeutung, aber die Erfassung der Hand-Objekt-Interaktion in aktuellen MR-Systemen ist völlig unzureichend. Derzeitige Systeme basieren auf handgehaltenen Controllern oder Erfassungsgeräten, die auf Handgesten ohne Kontakt mit realen Objekten beschränkt sind. GreifbAR löst diese Einschränkung, indem es ein Erfassungssystem einführt, das sowohl die vollständige Handhalterung inklusiv Handoberfläche als auch die Objektpose erkennt, wenn Benutzer mit realen Objekten oder Werkzeugen interagieren. Dieses Erfassungssystem wird in einen Mixed-Reality-Trainingssimulator integriert, der in zwei relevanten Anwendungsfällen demonstriert wird: industrielle Montage und Training chirurgischer Fertigkeiten. Die Nutzbarkeit und Anwendbarkeit sowie der Mehrwert für Trainingssituationen werden gründlich durch Benutzerstudien analysiert.

Fördergeber

Bundesministerium für Bildung und Forschung, BMBF

Förderkennzeichen

16SV8732                                                                                                                                                         

Projektlaufzeit

01.10.2021 – 30.09.2023

Verbundkoordination

Deutsches Forschungszentrum für Künstliche Intelligenz GmbH

Projektpartner

  • DFKI – Forschungsbereich Erweiterte Realität
  • NMY – Mixed Reality Communication GmbH
  • Charité – Universitätsmedizin Berlin
  • Universität Passau Lehrstuhl für Psychologie mit Schwerpunkt Mensch – Maschine – Interaktion

Fördervolumen

1.179.494 € (gesamt), 523.688 € (DFKI)

Kontakt: Dr. Jason Rambach

VIZTA Project Time-of-Flight Camera Datasets Released

As part of the research activities of DFKI Augmented Vision in the VIZTA project (https://www.vizta-ecsel.eu/), two publicly available datasets have been released and are available for download. TIMo dataset is a building indoor monitoring dataset for person detection, person counting, and anomaly detection. TICaM dataset is an automotive in-cabin monitoring dataset with a wide field of view for person detection and segmentation and activity recognition. Real and synthetic images are provided allowing for benchmarking of transfer learning algorithms as well. Both datasets are available here https://vizta-tof.kl.dfki.de/. The publication describing the datasets in detail are available as preprints.

TICaM: https://arxiv.org/pdf/2103.11719.pdf

TIMo: https://arxiv.org/pdf/2108.12196.pdf

Video: https://www.youtube.com/watch?v=xWCor9obttA

Contacts: Dr. Jason Rambach, Dr. Bruno Mirbach

Three papers accepted at ISMAR 2021

This image has an empty alt attribute; its file name is ISMAR2021.png

We are happy to announce that three papers from our department have been accepted at the ISMAR 2021 conference.

ISMAR, the International Symposium on Mixed and Augmented Reality, is the leading international academic conference in the field of Augmented Reality and Mixed Reality. The symposium will be held as a hybrid conference from October 4th to 8th, 2021, with its main location in the city of Bari, Italy.

The accepted papers of our department are the following:

Visual SLAM with Graph-Cut Optimized Multi-Plane Reconstruction
Fangwen Shu, Yaxu Xie, Jason Raphael Rambach, Alain Pagani, Didier Stricker

Comparing Head and AR Glasses Pose Estimation
Ahmet Firintepe, Oussema Dhaouadi, Alain Pagani, Didier Stricker

A Study of Human-Machine Teaming For Single Pilot Operation with Augmented Reality
Nareg Minaskan Karabid, Alain Pagani, Charles-Alban Dormoy, Jean-Marc Andre, Didier Stricker

We congratulate all authors for their publication!

Contact: Dr. Alain Pagani



XR for nature and environment survey

On July 29th, 2021, Dr. Jason Rambach presented the survey paper “A Survey on Applications of Augmented, Mixed and Virtual Reality for Nature and Environment” at the 23rd Human Computer Interaction Conference HCI International. The article is the result of a collaboration between DFKI, the Worms University of Applied Sciences and the University of Kaiserslautern.

Abstract: Augmented, virtual and mixed reality (AR/VR/MR) are technologies of great potential due to the engaging and enriching experiences they are capable of providing. However, the possibilities that AR/VR/MR offer in the area of environmental applications are not yet widely explored. In this paper, we present the outcome of a survey meant to discover and classify existing AR/VR/MR applications that can benefit the environment or increase awareness on environmental issues. We performed an exhaustive search over several online publication access platforms and past proceedings of major conferences in the fields of AR/VR/MR. Identified relevant papers were filtered based on novelty, technical soundness, impact and topic relevance, and classified into different categories. Referring to the selected papers, we discuss how the applications of each category are contributing to environmental protection and awareness. We further analyze these approaches as well as possible future directions in the scope of existing and upcoming AR/VR/MR enabling technologies.

Authors: Jason Rambach, Gergana Lilligreen, Alexander Schäfer, Ramya Bankanal, Alexander Wiebel, Didier Stricker

Paper: https://av.dfki.de/publications/a-survey-on-applications-of-augmented-mixed-and-virtual-reality-for-nature-and-environment/

Contact: Jason.Rambach@dfki.de

Advancing sports analytics to coach athletes through Deep Learning research

The recent advancements in Deep learning has lead to new interesting applications such as analyzing human motion and activities in recorded videos. The analysis covers from simple motion of humans walking, performing exercises to complex motions such as playing sports.

The athlete’s performance can be easily captured with a fixed camera for sports like tennis, badminton, diving, etc. The large availability of low cost cameras in handheld devices has further led to common place solution to record videos and analyze an athletes performance. Although the sports trainers can provide visual feedback by playing recorded videos, it is still hard to measure and monitor the performance improvement of the athlete.  Also, the manual analysis of the obtained footage is a time-consuming task which involves isolating actions of interest and categorizing them using domain-specific knowledge. Thus, the automatic interpretation of performance parameters in sports has gained a keen interest.

Competitive diving is one of the well recognized aquatic sport in Olympics in which a person dives from a platform or a springboard and performs different classes of acrobatics before descending into the water. These classes are standardized by international organization Fédération Internationale de Natation (FINA). The differences in the acrobatics performed in various classes of diving are very subtle. The difference arises in the duration which starts with the diver standing on a diving platform or a springboard and ends at the moment he/she dives into the water. This is a challenging task to model especially due to involvement of rapid changes and requires understanding of long-term human dynamics. Further, the model must be sensitive to subtle changes in body pose over a large number of frames to determine the correct classification.

In order to automate this kind of task, three challenging sub-problems are often encountered:  1) temporally cropping events/actions of interest from continuous video;  2) tracking the person of interest even though other divers and bystanders may be in view; and 3) classifying the events/actions of interest.

We are developing a solution in co-operation with Institut für Angewandte Trainingswissenshaft in Leipzig (IAT) to tackle the three subproblems. We work towards a complete parameter tracking solution based on monocular markerless human body motion tracking using only a mobile device (tablet or mobile phone) as training support tool to the overall diving action analysis. The techniques proposed, can be generalized to video footage recorded from other sports.

Contact person: Dr. Bertram Taetz, Pramod Murthy

Three Papers Accepted at CAIP 2021

We are happy to announce that three papers with respect to our structured light 3D reconstruction pipeline have been accepted for publication at the CAIP 2021. The International Conference on Computer Analysis of Images and Patterns will take place from September 28th to 30th, 2021 as a virtual conference.

The three accepted papers are entitled ”Fast Projector-Driven Structured Light Matching in Sub-Pixel Accuracy using Bilinear Interpolation Assumption”, ”Simultaneous Bi-Directional Structured Light Encoding for Practical Uncalibrated Profilometry” and ”Joint Global ICP for Improved Automatic Alignment of Full Turn Object Scans” and will be available right after the conference.

Authors: Torben Fetzer, Gerd Reis and Didier Stricker

VIZTA Project 24M Review and public summary

DFKI participates in the VIZTA project, coordinated by ST Micrelectronics, aiming  at developing innovative technologies in the field of optical sensors and laser sources for short to long-range 3D-imaging and to demonstrate their value in several key applications including automotive, security, smart buildings, mobile robotics for smart cities, and industry 4.0. The 24-month review by the EU-commission was completed and a public summary of the project was released, including updates from DFKI Augmented Vision on time-of-flight camera dataset recording and deep learning algorithm development for car in-cabin monitoring and smart building person counting and anomaly detection applications.

Please click here to check out the complete summary: https://www.vizta-ecsel.eu/newsletter-april-2021/

Contact: Dr. Jason Rambach, Dr. Bruno Mirbach

Paper Accepted in IEEE Access Journal!

We are happy to announce that our paper “Fast Gravitational Approach for Rigid Point Set Registration With Ordinary Differential Equations” has been accepted for publication in the IEEE Access Journal (Impact Factor: 3.745).

Abstract: This article introduces a new physics-based method for rigid point set alignment called Fast Gravitational Approach (FGA). In FGA, the source and target point sets are interpreted as rigid particle swarms with masses interacting in a globally multiply-linked manner while moving in a simulated gravitational force field. The optimal alignment is obtained by explicit modeling of forces acting on the particles as well as their velocities and displacements with second-order ordinary differential equations of n-body motion. Additional alignment cues can be integrated into FGA through particle masses. We propose a smooth-particle mass function for point mass initialization, which improves robustness to noise and structural discontinuities. To avoid the quadratic complexity of all-to-all point interactions, we adapt a Barnes-Hut tree for accelerated force computation and achieve quasilinear complexity. We show that the new method class has characteristics not found in previous alignment methods such as efficient handling of partial overlaps, inhomogeneous sampling densities, and coping with large point clouds with reduced runtime compared to the state of the art. Experiments show that our method performs on par with or outperforms all compared competing deep-learning-based and general-purpose techniques (which do not take training data) in resolving transformations for LiDAR data and gains state-of-the-art accuracy and speed when coping with different data.

Authors: Sk Aziz Ali, Kerem Kahraman, Christian Theobalt, Didier StrickerVladislav Golyanik

Link to the paper: https://ieeexplore.ieee.org/document/9442679

Paper accepted at ICIP 2021

We are happy to announce that our paper “SEMANTIC SEGMENTATION IN DEPTH DATA : A COMPARATIVE EVALUATION OF IMAGE AND POINT CLOUD BASED METHODS” has been accepted for publication at the ICIP 2021 IEEE International Conference on Image Processing which will take place from September 19th to 22nd, 2021 at Anchorage, Alaska, USA.

Abstract: The problem of semantic segmentation from depth images can be addressed by segmenting directly in the image domain or at 3D point cloud level. In this paper, we attempt for the first time to provide a study and experimental comparison of the two approaches. Through experiments on three datasets, namely SUN RGB-D, NYUdV2 and TICaM, we extensively compare various semantic segmentation algorithms, the input to which includes images and point clouds derived from them. Based on this, we offer analysis of the performance and computational cost of these algorithms that can provide guidelines on when each method should be preferred.

Authors: Jigyasa Katrolia, Lars Krämer, Jason Rambach, Bruno Mirbach, Didier Stricker

Paper: https://av.dfki.de/publications/semantic-segmentation-in-depth-data-a-comparative-evaluation-ofimage-and-point-cloud-based-methods/

Contact: Jigyasa_Singh.Katrolia@dfki.de, Jason.Rambach@dfki.de

Paper Accepted at CVPR 2021 Conference!

We are proud that our paper “RPSRNet: End-to-End Trainable Rigid Point Set Registration Network using Barnes-Hut 2^D-Tree Representation” has been accepted for publication at the Computer Vision Pattern Recognition (CVPR) 2021 Conference, which will take place virtually online from June 19th to 25th. CVPR is the premier annual computer vision conference. Our paper was accepted from ~12000 submissions as one of 23.4% (acceptance rate: 23.4%).

Abstract: We propose RPSRNet – a novel end-to-end trainable deep neural network for rigid point set registration. For this task, we use a novel 2^D-tree representation for the input point sets and a hierarchical deep feature embedding in the neural network. An iterative transformation refinement module of our network boosts the feature matching accuracy in the intermediate stages. We achieve an inference speed of ~12-15$\,$ms to register a pair of input point clouds as large as ~250K. Extensive evaluations on (i) KITTI LiDAR-odometry and (ii) ModelNet-40 datasets show that our method outperforms prior state-of-the-art methods – e.g., on the KITTI dataset, DCP-v2 by 1.3 and 1.5 times, and PointNetLK by 1.8 and 1.9 times better rotational and translational accuracy respectively. Evaluation on ModelNet40 shows that RPSRNet is more robust than other benchmark methods when the samples contain a significant amount of noise and disturbance. RPSRNet accurately registers point clouds with non-uniform sampling densities, e.g., LiDAR data, which cannot be processed by many existing deep-learning-based registration methods.

“Rigid Point Set Registration using Barnes-Hut (BH) 2^D-tree
Representation — The center-of-masses (CoMs) and point-densities of
non-empty tree-nodes are computed for the respective BH-trees of the
source and target. These two attributes are input to our RPSRNet which
predicts rigid transformation from the global feature-embedding of the
tree-nodes.”

Authors: Sk Aziz Ali, Kerem Kahraman, Gerd ReisDidier Stricker

To view the paper, please click here.

DFKI-BMW joint research on Augmented Reality for automotive use cases

In the frame of a research cooperation, DFKI’s Augmented Vision Department and BMW are working jointly on Augmented Reality for In-Car applications. Ahmet Firintepe, a BMW research PhD under the supervision of Dr. Alain Pagani and Prof. Didier Stricker has recently published two papers on outside-in head and glass pose estimation:

Ahmet Firintepe, Alain Pagani and Didier Stricker:
“A Comparison of Single and Multi-View IR image-based AR Glasses Pose Estimation Approaches”
Proc. of the IEEE Virtual Reality conference – Posters. IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) (IEEEVR-2021)

In this paper, we present a study on single and multi-view image-based AR glasses pose estimation with two novel methods. The first approach is named GlassPose and is a VGG-based network. The second approach GlassPoseRN is based on ResNet18. We train and evaluate the two custom developed glasses pose estimation networks with one, two and three input images on the HMDPose dataset. We achieve errors as low as 0.10 degrees and 0.90 mm on average on all axes for orientation and translation. For both networks, we observe minimal improvements in position estimation with more input views.

Ahmet Firintepe, Carolin Vey, Stylianos Asteriadis, Alain Pagani, Didier Stricker:
From IR Images to Point Clouds to Pose: Point Cloud-Based AR Glasses Pose Estimation
In: Journal of Imaging 7 80 Seiten 1-18 MDPI 4/2021.

In this paper, we propose two novel AR glasses pose estimation algorithms from single infrared images by using 3D point clouds as an intermediate representation. Our first approach “PointsToRotation” is based on a Deep Neural Network alone, whereas our second approach “PointsToPose” is a hybrid model combining Deep Learning and a voting-based mechanism. Our methods utilize a point cloud estimator, which we trained on multi-view infrared images in a semisupervised manner, generating point clouds based on one image only. We generate a point cloud dataset with our point cloud estimator using the HMDPose dataset, consisting of multi-view infrared images of various AR glasses with the corresponding 6-DoF poses. In comparison to another point cloud-based 6-DoF pose estimation named CloudPose, we achieve an error reduction of around 50%. Compared to a state-of-the-art image-based method, we reduce the pose estimation error by around 96%.

Paper accepted at MDPI Electronics

Our paper “Controlling Teleportation-Based Locomotion in Virtual Reality with Hand Gestures: A Comparative Evaluation of Two-Handed and One-Handed Techniques” got accepted at MDPI Electronics for a Special Issue on Recent Advances in Virtual Reality and Augmented Reality.

Paper: https://www.mdpi.com/2079-9292/10/6/715 (available as Open Access)
Authors: Alexander SchäferGerd ReisDidier Stricker

Abstract: Virtual Reality (VR) technology offers users the possibility to immerse and freely navigate through virtual worlds. An important component for achieving a high degree of immersion in VR is locomotion. Often discussed in the literature, a natural and effective way of controlling locomotion is still a general problem which needs to be solved. Recently, VR headset manufacturers have been integrating more sensors, allowing hand or eye tracking without any additional required equipment. This enables a wide range of application scenarios with natural freehand interaction techniques where no additional hardware is required. This paper focuses on techniques to control teleportation-based locomotion with hand gestures, where users are able to move around in VR using their hands only. With the help of a comprehensive study involving 21 participants, four different techniques are evaluated. The effectiveness and efficiency as well as user preferences of the presented techniques are determined. Two two-handed and two one-handed techniques are evaluated, revealing that it is possible to move comfortable and effectively through virtual worlds with a single hand only.

TiCAM Dataset for in-Cabin Monitoring released

As part of the research activities of DFKI Augmented Vision in the VIZTA project (https://www.vizta-ecsel.eu/), we have published the open-source dataset for automotive in-cabin monitoring with a wide-angle time-of-flight depth sensor. The TiCAM dataset represents a variety of in-car person behavior scenarios and is annotated with 2D/3D bounding boxes, segmentation masks and person activity labels. The dataset is available here https://vizta-tof.kl.dfki.de/. The publication describing the dataset in detail is available as a preprint here: https://arxiv.org/pdf/2103.11719.pdf

Contacts: Jason Rambach, Jigyasa Katrolia

Paper accepted at ICRA 2021

We are delighted to announce that our paper PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN has been accepted for publication at the ICRA 2021 IEEE International Conference on Robotics and Automation which will take place from May 30 to June 5, 2021 at Xi’an, China.

Abstract: Instance segmentation of planar regions in indoor scenes benefits  visual  SLAM  and  other  applications  such  as augmented reality (AR) where scene understanding is required. Existing  methods  built  upon  two-stage  frameworks  show  satisfactory  accuracy  but  are  limited  by  low  frame  rates.  In this  work,  we  propose  a  real-time  deep  neural  architecture that  estimates  piece-wise  planar  regions  from  a  single  RGB image. Our model employs a variant of a fast single-stage CNN architecture to segment plane instances.  Considering  the  particularity of the target detected, we propose Fast Feature Non-maximum  Suppression  (FF-NMS)  to  reduce  the  suppression errors  resulted  from  overlapping  bounding  boxes  of  planes. We  also  utilize  a  Residual  Feature  Augmentation  module  in the  Feature  Pyramid  Network  (FPN)  .  Our  method  achieves significantly  higher  frame-rates  and  comparable  segmentation accuracy  against  two-stage  methods.  We automatically label over 70,000 images as ground truth from the Stanford 2D-3D-Semantics dataset. Moreover, we incorporate our method with a state-of-the-art planar SLAM and validate  its  benefits.

Authors: Yaxu Xie, Jason Raphael Rambach, Fangwen Shu, Didier Stricker

Paper: https://av.dfki.de/publications/planesegnet-fast-and-robust-plane-estimation-using-a-single-stage-instance-segmentation-cnn/

Contact: Yaxu.Xie@dfki.de, Jason.Rambach@dfki.de