Our Augmented Vision department is the coordinator of the new large European project “HumanTech”. The Kick-Off meeting was held on July 20th, 2022, at DFKI in Kaiserslautern. Please read the whole article here: Artificial intelligence for a safe and sustainable construction industry (dfki.de)
Please check out the article “Artificial intelligence for a safe and sustainable construction industry (dfki.de)” about the new EU project HumanTech, which is coordinated by Dr. Jason Rambach, head of the Spatial Sensing and Machine Perception team (Augmented Reality/Augmented Vision department, Prof. Didier Stricker) at the German Research Center for Artificial Intelligence (DFKI) in Kaiserslautern.
DFKI Augmented Vision had a strong presence at the recent CVPR 2022 Conference, held on June 19th-23rd, 2022, in New Orleans, USA. The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) is the premier annual computer vision event internationally. Homepage: https://cvpr2022.thecvf.com/
Overall, three publications were presented:
1. ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation
Yongzhi Su, Mahdi Saleh, Torben Fetzer, Jason Raphael Rambach, Nassir Navab, Benjamin Busam, Didier Stricker, Federico Tombari
2. SOMSI: Spherical Novel View Synthesis with Soft Occlusion Multi-Sphere Images
Tewodros A Habtegebrial, Christiano Gava, Marcel Rogge, Didier Stricker, Varun Jampani
3. Unsupervised Anomaly Detection from Time-of-Flight Depth Images
Pascal Schneider, Jason Rambach, Bruno Mirbach, Didier Stricker
On June 14th, 2022, Dr. Jason Rambach gave a keynote talk in the Computer Vision session of the Franco-German Research and Innovation Network event held at the Inria headquarters in Versailles, near Paris, France. The talk gave an overview of the current activities of the Spatial Sensing and Machine Perception team at DFKI Augmented Vision.
On March 18th, 2022, René Schuster successfully defended his dissertation entitled “Data-driven and Sparse-to-Dense Concepts in Scene Flow Estimation for Automotive Applications”. The reviewers were Prof. Dr. Didier Stricker (Technical University of Kaiserslautern) and Prof. Dr. Andrés Bruhn (University of Stuttgart). Mr. Schuster received his doctorate from the Department of Computer Science at the Technical University of Kaiserslautern.
In his thesis, Mr. Schuster worked on three-dimensional motion estimation of the dynamic environment of vehicles. The focus was on machine learning methods, and the interpolation of individual estimates into a dense motion field. A particular challenge was the scarcity of annotated data for this problem and use case.
René Schuster received an M.Sc. in computational engineering from Darmstadt University of Technology in 2017. He then moved to DFKI to join the augmented reality group of Prof. Stricker. Much of his research was done in collaborative projects with BMW.
We are happy to announce that the Augmented Vision group will present two papers at the upcoming CVPR 2022 Conference, held June 19th-23rd in New Orleans, USA. The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) is the premier annual computer vision event internationally. Homepage: https://cvpr2022.thecvf.com/
The two accepted papers are:
- ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation
Yongzhi Su, Mahdi Saleh, Torben Fetzer, Jason Raphael Rambach, Nassir Navab, Benjamin Busam, Didier Stricker, Federico Tombari
Summary: ZebraPose introduces a new paradigm for model-based 6DoF object pose estimation: a binary encoding of the object surface is used to train a neural network that predicts the locations of model vertices in a coarse-to-fine manner. ZebraPose shows a major improvement over the state of the art on several datasets of the BOP Object Pose Estimation benchmark.
Contact: Yongzhi Su, Dr. Jason Rambach
- SOMSI: Spherical Novel View Synthesis with Soft Occlusion Multi-Sphere Images
Tewodros A Habtegebrial, Christiano Gava, Marcel Rogge, Didier Stricker, Varun Jampani
Summary: We propose a novel Multi-Sphere Image representation called Soft Occlusion MSI (SOMSI) and an efficient rendering technique that produces accurate spherical novel views from a sparse spherical light field. SOMSI models appearance features in a small set (e.g. 3) of occlusion levels instead of a larger number (e.g. 64) of MSI spheres. Experiments on both synthetic and real-world spherical light fields demonstrate that SOMSI provides a good balance between accuracy and run-time. SOMSI view synthesis quality is on par with state-of-the-art models like NeRF, while being two orders of magnitude faster.
For more information, please visit the project page at https://tedyhabtegebrial.github.io/somsi
Contact: Tewodros A Habtegebrial
We are happy to announce that our project DECODE has been accepted for the Nvidia Academic Hardware Grant. Nvidia will support our research in the field of human motion estimation and semantic reconstruction by donating an Nvidia A100 data center GPU. We will use the new hardware to accelerate our continual learning experiments.
We are happy to announce that the Augmented Vision group will present two papers at the upcoming BMVC 2021 Conference, held November 22-25, 2021.
The British Machine Vision Conference (BMVC) is the British Machine Vision Association (BMVA) annual conference on machine vision, image processing, and pattern recognition. It is one of the major international conferences on computer vision and related areas held in the UK. With increasing popularity and quality, it has established itself as a prestigious event on the vision calendar. Homepage: https://www.bmvc2021.com/
The two accepted papers are:
1. TICaM: A Time-of-flight In-car Cabin Monitoring Dataset
Authors: Jigyasa Singh Katrolia, Ahmed Elsherif, Hartmut Feld, Bruno Mirbach, Jason Raphael Rambach, Didier Stricker
Summary: TICaM is a Time-of-flight In-car Cabin Monitoring dataset for vehicle interior monitoring using a single wide-angle depth camera. The dataset goes beyond currently available in-car cabin datasets in the range of labeled classes, the recorded scenarios, and the annotations provided, all at once. The dataset is available here: https://vizta-tof.kl.dfki.de/
Video: https://www.youtube.com/watch?v=aqYUY2JzqHU
Contact: Jason Rambach
2. PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image
Authors: Yaxu Xie, Fangwen Shu, Jason Raphael Rambach, Alain Pagani, Didier Stricker
Summary: Piece-wise 3D planar reconstruction provides holistic scene understanding of man-made environments, especially for indoor scenarios. Unlike existing approaches, we enforce cross-task consistency in our multi-task convolutional neural network, PlaneRecNet, which integrates a single-stage instance segmentation network for piece-wise planar segmentation and a depth decoder to reconstruct the scene from a single RGB image.
Preprint: https://www.dfki.de/web/forschung/projekte-publikationen/publikationen-filter/publikation/11908
Contact: Alain Pagani
We are happy to announce that our paper “Multi-scale Iterative Residuals for Fast and Scalable Stereo Matching” has been accepted at CSCS 2021!
The Computer Science in Cars Symposium (CSCS) is ACM’s flagship event in the field of Car IT. The goal is to bring together scientists, engineers, business representatives, and anyone who shares a passion for solving the myriad of complex problems in vehicle technology and their application to automation, driver and vehicle safety, and driving system safety.
In our work, we place stereo matching in a coarse-to-fine estimation framework to reduce runtime and memory requirements while maintaining accuracy. This multi-scale framework is tested on two state-of-the-art stereo networks and shows significant improvements in runtime, computational complexity, and memory requirements.
Link to preprint: https://arxiv.org/abs/2110.12769
Title: Multi-scale Iterative Residuals for Fast and Scalable Stereo Matching
Authors: Kumail Raza, René Schuster, Didier Stricker
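To give a rough feeling for why a coarse-to-fine scheme can save memory, the following minimal sketch counts matching-cost entries for a single full-resolution disparity search versus a search at the coarsest scale followed by small residual searches at finer scales. It is an illustration only and not the architecture from the paper: the image size, the number of levels, the residual radius, and both function names are hypothetical choices, and the count ignores everything a real network stores besides the cost volume.

```python
def full_cost_volume_size(height, width, max_disp):
    """Number of matching costs for one full-resolution disparity search."""
    return height * width * max_disp


def coarse_to_fine_cost_volume_sizes(height, width, max_disp,
                                     levels=3, residual_radius=2):
    """Matching costs per pyramid level, coarsest level first.

    Assumption (hypothetical): each level halves the resolution, the
    coarsest level searches the full (downscaled) disparity range, and
    every finer level only searches residuals in
    [-residual_radius, +residual_radius] around the upsampled estimate.
    """
    sizes = []
    for level in range(levels - 1, -1, -1):  # iterate coarsest -> finest
        scale = 2 ** level
        h, w = height // scale, width // scale
        if level == levels - 1:
            # Full search, but the disparity range shrinks with the image.
            candidates = max(1, max_disp // scale)
        else:
            # Residual search around the estimate from the coarser level.
            candidates = 2 * residual_radius + 1
        sizes.append(h * w * candidates)
    return sizes


if __name__ == "__main__":
    # Hypothetical image size and disparity range, chosen for illustration.
    h, w, d = 384, 1248, 192
    full = full_cost_volume_size(h, w, d)
    per_level = coarse_to_fine_cost_volume_sizes(h, w, d)
    print("full search:", full)
    print("coarse-to-fine per level:", per_level)
    print("reduction factor:", full / sum(per_level))
```

Under these assumptions the total number of cost entries drops considerably, because the expensive full-range search only happens on the smallest image. The actual savings of any real method depend on its network design and are reported in the preprint linked above.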
AI for recognizing human motion and the environment
Adaptive methods that learn continuously (lifelong learning) are a central challenge in developing robust, realistic AI applications. Alongside the rich history of general continual learning, continual learning for machine vision under real-world conditions has also recently gained interest.
The goal of the DECODE project is to research continuously adaptable models for reconstructing and understanding human motion and the environment in application-oriented settings. Mobile, visual, and inertial sensors (accelerometers and gyroscopes) will be used for this purpose. For these different types of sensors and data, different continual learning approaches will be researched and developed to ensure a smooth transfer from laboratory conditions to everyday, realistic scenarios. The work concentrates on improvements in the semantic segmentation of images and videos, the estimation of the kinematics and pose of the human body, and the representation of movements and their context. The range of potential applications for the methods developed in DECODE is broad and includes detailed ergonomic analysis of human-machine interaction, for example in the workplace, in factories, or in vehicles.
Further information: https://www.dfki.de/web/forschung/projekte-publikationen/projekte-uebersicht/projekt/decode
Contact: René Schuster