We are very happy that our paper “CLEO: Continual Learning of Evolving Ontologies” has been accepted to the European Conference on Computer Vision (ECCV). A preprint of our work can be found here. A short video is available on YouTube. The paper is joint work with ZF Friedrichshafen AG and describes some results of our collaboration.
News
We are proud to announce that our paper “RMS-FlowNet++: Efficient and Robust Multi-scale Scene Flow Estimation for Large-Scale Point Clouds” by Ramy Battrawy, René Schuster, and Didier Stricker has been published in the International Journal of Computer Vision (IJCV). The online version of the paper can be found here; a preprint is available here. The paper extends our previous work on efficient scene flow estimation in dense point clouds, which was published at ICRA 2022.
The 6th workshop on Safe Artificial Intelligence for All Domains was held this year in conjunction with CVPR in Seattle. René Schuster has been a permanent member of the program committee since 2020.
Congratulations to Ramy Battrawy on winning the Best Poster Award at BMVC 2023 (https://bmvc2023.org) for his paper:
“EgoFlowNet: Non-Rigid Scene Flow from Point Clouds with Ego-Motion Support”
Ramy Battrawy (DFKI), René Schuster (DFKI), Didier Stricker (DFKI), BMVC 2023
The paper and the video are available at: https://proceedings.bmvc2023.org/441/
On March 18th, 2022, René Schuster successfully defended his dissertation entitled “Data-driven and Sparse-to-Dense Concepts in Scene Flow Estimation for Automotive Applications”. The reviewers were Prof. Dr. Didier Stricker (Technical University of Kaiserslautern) and Prof. Dr. Andrés Bruhn (University of Stuttgart). Mr. Schuster received his doctorate from the Department of Computer Science at the Technical University of Kaiserslautern.
In his thesis, Mr. Schuster worked on three-dimensional motion estimation of the dynamic environment of vehicles. The focus was on machine learning methods and on the interpolation of individual estimates into a dense motion field. A particular challenge was the scarcity of annotated data for this problem and use case.
René Schuster received an M.Sc. in computational engineering from Darmstadt University of Technology in 2017. He then moved to DFKI to join the augmented reality group of Prof. Stricker. Much of his research was done in collaborative projects with BMW.
We are happy to announce that our project DECODE has been accepted for the Nvidia Academic Hardware Grant. Nvidia will support our research in the field of human motion estimation and semantic reconstruction by donating an Nvidia A100 GPU for data centers. We will use the new hardware to accelerate our experiments on continual learning.
We are happy to announce that our paper “Multi-scale Iterative Residuals for Fast and Scalable Stereo Matching” has been accepted to CSCS 2021!
The Computer Science in Cars Symposium (CSCS) is ACM’s flagship event in the field of Car IT. The goal is to bring together scientists, engineers, business representatives, and anyone who shares a passion for solving the myriad of complex problems in vehicle technology and their application to automation, driver and vehicle safety, and driving system safety.
In our work, we place stereo matching in a coarse-to-fine estimation framework to reduce runtime and memory requirements while maintaining accuracy. This multi-scale framework is tested with two state-of-the-art stereo networks and shows significant improvements in runtime, computational complexity, and memory requirements.
Link to preprint: https://arxiv.org/abs/2110.12769
Title: Multi-scale Iterative Residuals for Fast and Scalable Stereo Matching
Authors: Kumail Raza, René Schuster, Didier Stricker
AI for Recognizing Human Motion and the Environment
Adaptive methods that learn continuously (lifelong learning) are a central challenge in developing robust, realistic AI applications. Alongside the rich history of general continual learning, continual learning for machine vision under real-world conditions has recently attracted growing interest.
The goal of the DECODE project is to research continually adaptable models for reconstructing and understanding human motion and the environment in application-oriented settings. To this end, mobile visual and inertial sensors (accelerometers and gyroscopes) will be used. For these different types of sensors and data, different continual learning approaches will be researched and developed to ensure a smooth transfer from laboratory conditions to everyday, realistic scenarios. The work focuses on improvements in the semantic segmentation of images and videos, the estimation of the kinematics and pose of the human body, and the representation of movements and their context. The range of potential applications for the methods developed in DECODE is broad and includes detailed ergonomic analysis of human-machine interaction, for example at the workplace, in factories, or in vehicles.
Further information: https://www.dfki.de/web/forschung/projekte-publikationen/projekte-uebersicht/projekt/decode
Contact: René Schuster
We are proud to announce that the Augmented Vision group will present three papers at the upcoming ICPR 2020 conference, which will take place from January 10th to 15th, 2021. The International Conference on Pattern Recognition (ICPR) is the premier world conference in Pattern Recognition. It covers both theoretical issues and applications of the discipline. The 25th event in this series is organized as an online virtual conference with more than 1800 participants expected.
The three accepted papers are:
1. HPERL: 3D Human Pose Estimation from RGB and LiDAR
David Michael Fürst, Shriya T. P. Gupta, René Schuster, Oliver Wasenmüller, Didier Stricker
One sentence summary: HPERL proposes a two-stage 3D human pose detector that fuses RGB and LiDAR information for precise localization in 3D.
Presentation date: PS T3.3, January 12th, 5 pm CET.
2. ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid Networks for Accurate Dense Pixel Matching
Rishav, René Schuster, Ramy Battrawy, Oliver Wasenmüller, Didier Stricker
One sentence summary: ResFPN extends Feature Pyramid Networks by adding residual connections from higher resolution feature maps to obtain stronger and better localized features for dense matching with deep neural networks.
This paper has been accepted as an oral presentation (top 6% of all submissions).
Presentation date: OS T5.1, January 12th, 2 pm CET; PS T5.1, January 12th, 5 pm CET.
3. Ghost Target Detection in 3D Radar Data using Point Cloud based Deep Neural Network
Mahdi Chamseddine, Jason Rambach, Oliver Wasenmüller, Didier Stricker
One sentence summary: An extension to PointNet is developed and trained to detect ghost targets in 3D radar point clouds using labels generated by an automatic labelling algorithm.
Presentation date: PS T1.16, January 15th, 4:30 pm CET.
The Winter Conference on Applications of Computer Vision (WACV 2021) is IEEE’s and the PAMI-TC’s premier meeting on applications of computer vision. With its high quality and low cost, it provides exceptional value for students, academics, and industry researchers. In 2021, the conference is organized as a virtual online event from January 5th to 9th, 2021.
Our four accepted papers are:
1. SSGP: Sparse Spatial Guided Propagation for Robust and Generic Interpolation
René Schuster, Oliver Wasenmüller, Christian Unger, Didier Stricker
Q/A Session: Oral 1B, January 6th, 7 pm CET.
2. A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions
René Schuster, Christian Unger, Didier Stricker
Q/A Session: Oral 1C, January 6th, 7 pm CET.
3. SLAM in the Field: An Evaluation of Monocular Mapping and Localization on Challenging Dynamic Agricultural Environment
Fangwen Shu, Paul Lesur, Yaxu Xie, Alain Pagani, Didier Stricker
Abstract: This paper demonstrates a system capable of combining a sparse, indirect, monocular visual SLAM, with both offline and real-time Multi-View Stereo (MVS) reconstruction algorithms. This combination overcomes many obstacles encountered by autonomous vehicles or robots employed in agricultural environments, such as overly repetitive patterns, need for very detailed reconstructions, and abrupt movements caused by uneven roads. Furthermore, the use of a monocular SLAM makes our system much easier to integrate with an existing device, as we do not rely on a LiDAR (which is expensive and power consuming), or stereo camera (whose calibration is sensitive to external perturbation e.g. camera being displaced). To the best of our knowledge, this paper presents the first evaluation results for monocular SLAM, and our work further explores unsupervised depth estimation on this specific application scenario by simulating RGB-D SLAM to tackle the scale ambiguity, and shows our approach produces reconstructions that are helpful to various agricultural tasks. Moreover, we highlight that our experiments provide meaningful insight to improve monocular SLAM systems under agricultural settings.
4. Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function
Steve Dias Da Cruz, Bertram Taetz, Thomas Stifter, Didier Stricker
Abstract: Images recorded during the lifetime of computer vision based systems undergo a wide range of illumination and environmental conditions affecting the reliability of previously trained machine learning models. Image normalization is hence a valuable preprocessing component to enhance the models’ robustness. To this end, we introduce a new strategy for the cost function formulation of encoder-decoder networks to average out all the unimportant information in the input images (e.g. environmental features and illumination changes) to focus on the reconstruction of the salient features (e.g. class instances). Our method exploits the availability of identical sceneries under different illumination and environmental conditions for which we formulate a partially impossible reconstruction target: the input image will not convey enough information to reconstruct the target in its entirety. Its applicability is assessed on three publicly available datasets. We combine the triplet loss as a regularizer in the latent space representation and a nearest neighbour search to improve the generalization to unseen illuminations and class instances. The importance of the aforementioned post-processing is highlighted on an automotive application. To this end, we release a synthetic dataset of sceneries from three different passenger compartments where each scenery is rendered under ten different illumination and environmental conditions: https://sviro.kl.dfki.de
The International Journal of Computer Vision (IJCV) is considered one of the top journals in Computer Vision. It details the science and engineering of this rapidly growing field. Regular articles present major technical advances of broad general interest. Survey articles offer critical reviews of the state of the art and/or tutorial presentations of pertinent topics.
We are proud to announce that our paper “SceneFlowFields++: Multi-frame Matching, Visibility Prediction, and Robust Interpolation for Scene Flow Estimation” has been published in the IJCV (for more information click here). It is an extension of our earlier WACV paper “SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences”.