GreifbAR Project – Tangible Reality – Interaction with Real Tools in Mixed-Reality Worlds

The research project GreifbAR started on October 1st, 2021, under the leadership of DFKI (Augmented Vision research department). The goal of the GreifbAR project is to make mixed-reality (MR) worlds, including virtual reality (VR) and augmented reality (AR), tangible and graspable by letting users interact with real and virtual objects using their bare hands. Accuracy and dexterity of the hands are paramount for carrying out precise tasks in many fields, yet the capture of hand-object interaction in current MR systems is wholly inadequate. Current systems rely on hand-held controllers or capture devices that are limited to hand gestures without contact with real objects. GreifbAR removes this limitation by introducing a capture system that detects both the full hand pose, including the hand surface, and the object pose when users interact with real objects or tools. This capture system will be integrated into a mixed-reality training simulator and demonstrated in two relevant use cases: industrial assembly and surgical skills training. Usability and applicability, as well as the added value for training situations, will be analyzed thoroughly in user studies.

Funding agency

German Federal Ministry of Education and Research (BMBF)

Grant number

16SV8732

Project duration

01.10.2021 – 30.09.2023

Consortium coordination

Deutsches Forschungszentrum für Künstliche Intelligenz GmbH

Project partners

  • DFKI – Augmented Vision research department
  • NMY – Mixed Reality Communication GmbH
  • Charité – Universitätsmedizin Berlin
  • Universität Passau, Chair of Psychology with a focus on Human-Machine Interaction

Funding volume

1,179,494 € (total), 523,688 € (DFKI)

Contact: Dr. Jason Rambach

VIZTA Project Time-of-Flight Camera Datasets Released

As part of the research activities of DFKI Augmented Vision in the VIZTA project (https://www.vizta-ecsel.eu/), two publicly available datasets have been released and are available for download. The TIMo dataset is an indoor building monitoring dataset for person detection, person counting, and anomaly detection. The TICaM dataset is an automotive in-cabin monitoring dataset with a wide field of view for person detection, segmentation, and activity recognition. Both real and synthetic images are provided, which also allows benchmarking of transfer learning algorithms. Both datasets are available at https://vizta-tof.kl.dfki.de/. The publications describing the datasets in detail are available as preprints:

TICaM: https://arxiv.org/pdf/2103.11719.pdf

TIMo: https://arxiv.org/pdf/2108.12196.pdf

Video: https://www.youtube.com/watch?v=xWCor9obttA

Contacts: Dr. Jason Rambach, Dr. Bruno Mirbach

Three papers accepted at ISMAR 2021


We are happy to announce that three papers from our department have been accepted at the ISMAR 2021 conference.

ISMAR, the International Symposium on Mixed and Augmented Reality, is the leading international academic conference in the field of Augmented Reality and Mixed Reality. The symposium will be held as a hybrid conference from October 4th to 8th, 2021, with its main location in the city of Bari, Italy.

The accepted papers of our department are the following:

Visual SLAM with Graph-Cut Optimized Multi-Plane Reconstruction
Fangwen Shu, Yaxu Xie, Jason Raphael Rambach, Alain Pagani, Didier Stricker

Comparing Head and AR Glasses Pose Estimation
Ahmet Firintepe, Oussema Dhaouadi, Alain Pagani, Didier Stricker

A Study of Human-Machine Teaming For Single Pilot Operation with Augmented Reality
Nareg Minaskan Karabid, Alain Pagani, Charles-Alban Dormoy, Jean-Marc Andre, Didier Stricker

We congratulate all authors on their publications!

Contact: Dr. Alain Pagani



XR for nature and environment survey

On July 29th, 2021, Dr. Jason Rambach presented the survey paper “A Survey on Applications of Augmented, Mixed and Virtual Reality for Nature and Environment” at the 23rd International Conference on Human-Computer Interaction (HCI International 2021). The article is the result of a collaboration between DFKI, Worms University of Applied Sciences, and the University of Kaiserslautern.

Abstract: Augmented, virtual and mixed reality (AR/VR/MR) are technologies of great potential due to the engaging and enriching experiences they are capable of providing. However, the possibilities that AR/VR/MR offer in the area of environmental applications are not yet widely explored. In this paper, we present the outcome of a survey meant to discover and classify existing AR/VR/MR applications that can benefit the environment or increase awareness on environmental issues. We performed an exhaustive search over several online publication access platforms and past proceedings of major conferences in the fields of AR/VR/MR. Identified relevant papers were filtered based on novelty, technical soundness, impact and topic relevance, and classified into different categories. Referring to the selected papers, we discuss how the applications of each category are contributing to environmental protection and awareness. We further analyze these approaches as well as possible future directions in the scope of existing and upcoming AR/VR/MR enabling technologies.

Authors: Jason Rambach, Gergana Lilligreen, Alexander Schäfer, Ramya Bankanal, Alexander Wiebel, Didier Stricker

Paper: https://av.dfki.de/publications/a-survey-on-applications-of-augmented-mixed-and-virtual-reality-for-nature-and-environment/

Contact: Jason.Rambach@dfki.de

Press coverage of our project “KI-Rebschnitt”
Advancing sports analytics to coach athletes through Deep Learning research

Recent advancements in deep learning have led to interesting new applications such as analyzing human motion and activities in recorded videos. The analysis ranges from simple motions, such as walking or performing exercises, to complex motions such as playing sports.

An athlete’s performance can easily be captured with a fixed camera for sports like tennis, badminton, or diving. The wide availability of low-cost cameras in handheld devices has further made recording videos and analyzing an athlete’s performance a commonplace solution. Although sports trainers can provide visual feedback by playing back recorded videos, it is still hard to measure and monitor the athlete’s performance improvement. Moreover, manual analysis of the obtained footage is a time-consuming task that involves isolating actions of interest and categorizing them using domain-specific knowledge. The automatic interpretation of performance parameters in sports has therefore gained keen interest.

Competitive diving is a well-recognized Olympic aquatic sport in which an athlete dives from a platform or a springboard and performs different classes of acrobatics before entering the water. These classes are standardized by the international governing body Fédération Internationale de Natation (FINA). The differences between the acrobatics performed in the various diving classes are very subtle and unfold within the short interval that starts with the diver standing on the platform or springboard and ends at the moment he or she enters the water. This is challenging to model, particularly because of the rapid changes involved, and requires an understanding of long-term human dynamics. Furthermore, the model must be sensitive to subtle changes in body pose over a large number of frames to determine the correct classification.

To automate this kind of task, three challenging sub-problems must typically be addressed: 1) temporally cropping events/actions of interest from continuous video; 2) tracking the person of interest even though other divers and bystanders may be in view; and 3) classifying the events/actions of interest.

We are developing a solution in cooperation with the Institut für Angewandte Trainingswissenschaft (IAT) in Leipzig to tackle these three sub-problems. We are working towards a complete parameter tracking solution based on monocular markerless human body motion tracking, using only a mobile device (tablet or phone) as a training support tool for the overall diving action analysis. The proposed techniques can be generalized to video footage recorded from other sports.
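As a rough illustration of how the three sub-problems could be wired together, the following Python sketch stubs out each stage. The function bodies (a crude motion-energy threshold, a dummy tracker, and a dummy classifier) are hypothetical placeholders, not the methods developed in this project.

```python
# Illustrative pipeline sketch for the three sub-problems; all stage
# implementations are placeholders, not the project's actual methods.
import numpy as np

def crop_action_segments(frames, motion_threshold=10.0):
    """Sub-problem 1: temporally crop segments with high motion energy."""
    # Mean absolute frame difference as a crude per-frame motion score.
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0)).mean(axis=(1, 2, 3))
    active = diffs > motion_threshold
    segments, start = [], None
    for t, is_active in enumerate(active):
        if is_active and start is None:
            start = t
        elif not is_active and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(active)))
    return segments

def track_person_of_interest(segment_frames):
    """Sub-problem 2 (placeholder): one bounding box per frame.
    A real system would combine a person detector with an appearance-based tracker."""
    h, w = segment_frames.shape[1:3]
    return [(w // 4, h // 4, w // 2, h // 2)] * len(segment_frames)  # dummy centered box

def classify_action(segment_frames, person_boxes):
    """Sub-problem 3 (placeholder): map a tracked segment to an action label.
    A real system would feed pose sequences into a temporal classifier."""
    return "unknown_dive_class"

if __name__ == "__main__":
    video = np.random.randint(0, 255, size=(120, 90, 160, 3), dtype=np.uint8)  # dummy clip
    for start, end in crop_action_segments(video):
        boxes = track_person_of_interest(video[start:end])
        label = classify_action(video[start:end], boxes)
        print(f"segment [{start}, {end}) -> {label}")
```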

Contact person: Dr. Bertram Taetz, Pramod Murthy

Three Papers Accepted at CAIP 2021

We are happy to announce that three papers related to our structured-light 3D reconstruction pipeline have been accepted for publication at CAIP 2021. The International Conference on Computer Analysis of Images and Patterns will take place from September 28th to 30th, 2021 as a virtual conference.

The three accepted papers are entitled “Fast Projector-Driven Structured Light Matching in Sub-Pixel Accuracy using Bilinear Interpolation Assumption”, “Simultaneous Bi-Directional Structured Light Encoding for Practical Uncalibrated Profilometry”, and “Joint Global ICP for Improved Automatic Alignment of Full Turn Object Scans”; they will be available right after the conference.
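The first of these titles refers to sub-pixel matching under a bilinear-interpolation assumption. As a generic illustration of that building block only (not the matching algorithm of the paper), sampling an image at non-integer coordinates can be sketched as follows:

```python
# Generic bilinear interpolation at sub-pixel coordinates (illustrative only;
# not the structured-light matching algorithm from the paper).
import numpy as np

def bilinear_sample(image, x, y):
    """Sample a 2D image at real-valued (x, y) using bilinear interpolation."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, image.shape[1] - 1), min(y0 + 1, image.shape[0] - 1)
    ax, ay = x - x0, y - y0  # fractional offsets
    top = (1 - ax) * image[y0, x0] + ax * image[y0, x1]
    bottom = (1 - ax) * image[y1, x0] + ax * image[y1, x1]
    return (1 - ay) * top + ay * bottom

img = np.arange(16, dtype=np.float32).reshape(4, 4)
print(bilinear_sample(img, 1.5, 2.25))  # weighted blend of the four neighboring pixels
```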

Authors: Torben Fetzer, Gerd Reis and Didier Stricker

VIZTA Project 24M Review and public summary

DFKI participates in the VIZTA project, coordinated by STMicroelectronics, which aims at developing innovative technologies in the field of optical sensors and laser sources for short- to long-range 3D imaging and at demonstrating their value in several key applications, including automotive, security, smart buildings, mobile robotics for smart cities, and Industry 4.0. The 24-month review by the EU Commission has been completed and a public summary of the project has been released, including updates from DFKI Augmented Vision on time-of-flight camera dataset recording and deep learning algorithm development for car in-cabin monitoring as well as smart-building person counting and anomaly detection applications.

Please click here to check out the complete summary: https://www.vizta-ecsel.eu/newsletter-april-2021/

Contact: Dr. Jason Rambach, Dr. Bruno Mirbach

Paper accepted at ICIP 2021

We are happy to announce that our paper “Semantic Segmentation in Depth Data: A Comparative Evaluation of Image and Point Cloud Based Methods” has been accepted for publication at ICIP 2021, the IEEE International Conference on Image Processing, which will take place from September 19th to 22nd, 2021 in Anchorage, Alaska, USA.

Abstract: The problem of semantic segmentation from depth images can be addressed by segmenting directly in the image domain or at 3D point cloud level. In this paper, we attempt for the first time to provide a study and experimental comparison of the two approaches. Through experiments on three datasets, namely SUN RGB-D, NYUdV2 and TICaM, we extensively compare various semantic segmentation algorithms, the input to which includes images and point clouds derived from them. Based on this, we offer analysis of the performance and computational cost of these algorithms that can provide guidelines on when each method should be preferred.
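The comparison relies on deriving point clouds from the depth images themselves. A minimal sketch of that standard pinhole back-projection step is shown below; the intrinsics are placeholder values, not the exact preprocessing or camera parameters used in the paper.

```python
# Minimal sketch: convert a depth image to a point cloud via pinhole
# back-projection. The intrinsics here are placeholders, not dataset values.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) to an (N, 3) point cloud in camera coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels

depth = np.full((480, 640), 2.0, dtype=np.float32)   # dummy flat 2 m depth map
cloud = depth_to_point_cloud(depth, fx=570.0, fy=570.0, cx=319.5, cy=239.5)
print(cloud.shape)  # (307200, 3)
```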

Authors: Jigyasa Katrolia, Lars Krämer, Jason Rambach, Bruno Mirbach, Didier Stricker

Paper: https://av.dfki.de/publications/semantic-segmentation-in-depth-data-a-comparative-evaluation-ofimage-and-point-cloud-based-methods/

Contact: Jigyasa_Singh.Katrolia@dfki.de, Jason.Rambach@dfki.de

Paper Accepted at CVPR 2021 Conference!

We are proud that our paper “RPSRNet: End-to-End Trainable Rigid Point Set Registration Network using Barnes-Hut 2^D-Tree Representation” has been accepted for publication at the Computer Vision and Pattern Recognition (CVPR) 2021 conference, which will take place virtually online from June 19th to 25th. CVPR is the premier annual computer vision conference. Our paper was accepted out of approximately 12,000 submissions, with an acceptance rate of 23.4%.

Abstract: We propose RPSRNet – a novel end-to-end trainable deep neural network for rigid point set registration. For this task, we use a novel 2^D-tree representation for the input point sets and a hierarchical deep feature embedding in the neural network. An iterative transformation refinement module of our network boosts the feature matching accuracy in the intermediate stages. We achieve an inference speed of ~12-15 ms to register a pair of input point clouds as large as ~250K. Extensive evaluations on (i) KITTI LiDAR-odometry and (ii) ModelNet-40 datasets show that our method outperforms prior state-of-the-art methods – e.g., on the KITTI dataset, DCP-v2 by 1.3 and 1.5 times, and PointNetLK by 1.8 and 1.9 times better rotational and translational accuracy respectively. Evaluation on ModelNet40 shows that RPSRNet is more robust than other benchmark methods when the samples contain a significant amount of noise and disturbance. RPSRNet accurately registers point clouds with non-uniform sampling densities, e.g., LiDAR data, which cannot be processed by many existing deep-learning-based registration methods.

Figure caption: “Rigid Point Set Registration using Barnes-Hut (BH) 2^D-tree Representation — The center-of-masses (CoMs) and point-densities of non-empty tree-nodes are computed for the respective BH-trees of the source and target. These two attributes are input to our RPSRNet which predicts rigid transformation from the global feature-embedding of the tree-nodes.”
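To give an idea of the input representation the abstract and caption refer to, the toy sketch below builds a quadtree (the 2D case of a 2^D-tree) and records the center of mass and point density of each non-empty node. It only illustrates that representation, not the authors' implementation or the network itself.

```python
# Toy quadtree (2D case of a 2^D-tree) storing center of mass and point density
# per non-empty node, as an illustration of the input representation described
# in the abstract. Not the authors' implementation.
import numpy as np

def build_tree(points, center, half_size, depth, max_depth=3):
    """Recursively collect (depth, center_of_mass, density) for non-empty nodes."""
    if len(points) == 0:
        return []
    com = points.mean(axis=0)
    density = len(points) / (2.0 * half_size) ** 2  # points per unit area of the node
    nodes = [(depth, com, density)]
    if depth == max_depth:
        return nodes
    for dx in (-0.5, 0.5):
        for dy in (-0.5, 0.5):
            child_center = center + half_size * np.array([dx, dy])
            mask = np.all(np.abs(points - child_center) <= half_size / 2, axis=1)
            nodes += build_tree(points[mask], child_center, half_size / 2,
                                depth + 1, max_depth)
    return nodes

pts = np.random.rand(1000, 2)  # dummy 2D point set in the unit square
for depth, com, dens in build_tree(pts, center=np.array([0.5, 0.5]),
                                   half_size=0.5, depth=0)[:5]:
    print(depth, np.round(com, 3), round(dens, 1))
```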

Authors: Sk Aziz Ali, Kerem Kahraman, Gerd Reis, Didier Stricker

To view the paper, please click here.