On March 4th, 2021, Dr. Jason Rambach gave a talk on Machine Learning and Computer Vision at the GIZ (Deutsche Gesellschaft für Internationale Zusammenarbeit) workshop on Machine Learning and Computer Vision for Earth Observation, organized by the DFKI MLT department. The talk covered the foundations of computer vision, machine learning, and deep learning, as well as current research and implementation challenges.
DFKI participates in the VIZTA project, coordinated by STMicroelectronics, which aims to develop innovative technologies in the field of optical sensors and laser sources for short- to long-range 3D imaging, and to demonstrate their value in several key applications including automotive, security, smart buildings, mobile robotics for smart cities, and Industry 4.0. The 18-month public summary of the project has been released, including updates from DFKI Augmented Vision on time-of-flight camera dataset recording and on deep learning algorithm development for car in-cabin monitoring as well as person counting and anomaly detection in smart buildings.
Please click here to check out the complete summary.
We are excited to announce that the Augmented Vision group will present three papers in the upcoming VISAPP 2021 Conference, February 8th-10th, 2021:
The International Conference on Computer Vision Theory and Applications (VISAPP) is part of VISIGRAPP, the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. VISAPP aims at becoming a major point of contact between researchers, engineers and practitioners in the area of computer vision application systems. Homepage: http://www.visapp.visigrapp.org/
The three accepted papers are:
1. An Adversarial Training based Framework for Depth Domain Adaptation
Jigyasa Singh Katrolia, Lars Krämer, Jason Raphael Rambach, Bruno Mirbach, Didier Stricker
One sentence summary: The paper presents a GAN-based method for domain adaptation between depth images (an illustrative sketch of this kind of adversarial adaptation follows this list).
2. OFFSED: Off-Road Semantic Segmentation Dataset
Peter Neigel, Jason Raphael Rambach, Didier Stricker
One sentence summary: A dataset for semantic segmentation in off-road scenes for automotive applications is made publicly available.
3. SALT: A Semi-automatic Labeling Tool for RGB-D Video Sequences
Dennis Stumpf, Stephan Krauß, Gerd Reis, Oliver Wasenmüller, Didier Stricker
One sentence summary: SALT proposes a simple and effective tool to facilitate the annotation process for segmentation and detection ground truth data in RGB-D video sequences.
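As a rough illustration of what GAN-based adaptation between depth domains can look like (the architectures, loss weighting, and training loop below are our own minimal assumptions, not the setup from the paper), here is a PyTorch sketch in which a translator network maps synthetic depth maps towards the real-depth domain while a discriminator tries to tell the two apart:

```python
# Minimal sketch of adversarial (GAN-based) domain adaptation for depth images.
# All network shapes and loss weights are illustrative assumptions, not the
# paper's actual architecture.
import torch
import torch.nn as nn

class Translator(nn.Module):
    """Maps synthetic depth maps towards the real-depth domain."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Predicts whether a depth map comes from the real domain."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )
    def forward(self, x):
        return self.net(x)

G, D = Translator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(synthetic_depth, real_depth):
    # Discriminator: real depth -> 1, translated synthetic depth -> 0.
    fake = G(synthetic_depth).detach()
    loss_d = bce(D(real_depth), torch.ones(real_depth.size(0), 1)) + \
             bce(D(fake), torch.zeros(fake.size(0), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Translator: try to fool the discriminator.
    fake = G(synthetic_depth)
    loss_g = bce(D(fake), torch.ones(fake.size(0), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```

In a full pipeline one would typically add a task loss (e.g. on segmentation labels carried over from the synthetic domain) so the translation stays content-preserving; the sketch shows only the adversarial component.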
We are happy to announce that our paper “SynPo-Net–Accurate and Fast CNN-Based 6DoF Object Pose Estimation Using Synthetic Training” has been accepted for publication in the MDPI journal Sensors, Special Issue Object Tracking and Motion Analysis. Sensors (ISSN 1424-8220; CODEN: SENSC9) is the leading international peer-reviewed open access journal on the science and technology of sensors.
Abstract: Estimation and tracking of 6DoF poses of objects in images is a challenging problem of great importance for robotic interaction and augmented reality. Recent approaches applying deep neural networks for pose estimation have shown encouraging results. However, most of them rely on training with real images of objects, with severe limitations concerning ground truth pose acquisition, full coverage of possible poses, and training dataset scaling and generalization capability. This paper presents a novel approach using a Convolutional Neural Network (CNN) trained exclusively on single-channel synthetic images of objects to regress 6DoF object poses directly (SynPo-Net). The proposed SynPo-Net is a network architecture specifically designed for pose regression, together with a domain adaptation scheme that transforms real and synthetic images into an intermediate domain better suited for establishing correspondences. The extensive evaluation shows that our approach significantly outperforms the state of the art using synthetic training in terms of both accuracy and speed. Our system can be used to estimate the 6DoF pose from a single frame, or be integrated into a tracking system to provide the initial pose.
Authors: Yongzhi Su, Jason Raphael Rambach, Alain Pagani, Didier Stricker
Contact: Yongzhi.Su@dfki.de, Jason.Rambach@dfki.de
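To make the abstract's “intermediate domain” idea more tangible, here is a hedged PyTorch sketch. The Sobel edge transform below is an illustrative stand-in (the paper defines its own transform), and the toy pose regressor is likewise only an assumption:

```python
# Hedged sketch: transform both real and synthetic single-channel images into
# a shared edge-like intermediate domain, then regress the pose from it.
import torch
import torch.nn as nn
import torch.nn.functional as F

def to_intermediate_domain(gray):
    """gray: (N, 1, H, W) single-channel images -> normalized edge maps."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    edges = torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)
    return edges / (edges.amax(dim=(2, 3), keepdim=True) + 1e-8)

class PoseRegressor(nn.Module):
    """Toy CNN regressing a 3D translation and a unit quaternion."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.head = nn.Linear(32 * 4 * 4, 7)  # (tx, ty, tz, qw, qx, qy, qz)
    def forward(self, img):
        out = self.head(self.features(to_intermediate_domain(img)))
        t, q = out[:, :3], out[:, 3:]
        return t, q / q.norm(dim=1, keepdim=True)  # normalized quaternion
```

The key property is symmetry: the same transform is applied to synthetic training images and to real test images, so the network never has to bridge the synthetic-to-real appearance gap directly.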
We are proud to announce that the Augmented Vision group will present three papers in the upcoming ICPR 2020 conference, which will take place from January 10th to 15th, 2021. The International Conference on Pattern Recognition (ICPR) is the premier world conference in pattern recognition. It covers both theoretical issues and applications of the discipline. The 25th event in this series is organized as an online virtual conference with more than 1800 participants expected.
The three accepted papers are:
1. HPERL: 3D Human Pose Estimation from RGB and LiDAR
David Michael Fürst, Shriya T. P. Gupta, René Schuster, Oliver Wasenmüller, Didier Stricker
One sentence summary: HPERL proposes a two-stage 3D human pose detector that fuses RGB and LiDAR information for precise localization in 3D.
Presentation date: PS T3.3, January 12th, 5 pm CET.
2. ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid Networks for Accurate Dense Pixel Matching
Rishav, René Schuster, Ramy Battrawy, Oliver Wasenmüller, Didier Stricker
One sentence summary: ResFPN extends Feature Pyramid Networks by adding residual connections from higher-resolution feature maps to obtain stronger and better localized features for dense matching with deep neural networks (a hedged sketch of this idea follows the list below).
This paper is accepted as an oral presentation (top 6% of all submissions).
Presentation date: OS T5.1, January 12th, 2 pm CET; PS T5.1, January 12th, 5 pm CET.
3. Ghost Target Detection in 3D Radar Data using Point Cloud based Deep Neural Network
Mahdi Chamseddine, Jason Rambach, Oliver Wasenmüller, Didier Stricker
One sentence summary: An extension to PointNet is developed and trained to detect ghost targets in 3D radar point clouds, using labels generated by an automatic labelling algorithm.
Presentation date: PS T1.16, January 15th, 4:30 pm CET.
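As referenced in the ResFPN summary above, here is a hedged PyTorch sketch of residual skip connections in a feature pyramid: on top of a standard FPN top-down pathway, higher-resolution encoder features are downsampled and added as residuals to coarser pyramid levels. Channel sizes and the exact wiring are our assumptions, not the paper's definition:

```python
# Hedged sketch of residual skips in an FPN; not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResFPNSketch(nn.Module):
    def __init__(self, channels=(64, 128, 256), out_ch=64):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in channels)
        self.skip = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in channels)

    def forward(self, feats):  # feats: encoder maps, highest resolution first
        lat = [l(f) for l, f in zip(self.lateral, feats)]
        # Standard FPN top-down aggregation.
        pyramid = [lat[-1]]
        for x in reversed(lat[:-1]):
            up = F.interpolate(pyramid[0], size=x.shape[-2:], mode="nearest")
            pyramid.insert(0, x + up)
        # Residual skips: downsampled higher-resolution features are added
        # to every coarser pyramid level.
        out = []
        for i, p in enumerate(pyramid):
            res = sum(
                F.adaptive_avg_pool2d(self.skip[j](feats[j]), p.shape[-2:])
                for j in range(i)
            )
            out.append(p + res if i > 0 else p)
        return out

# Usage with dummy encoder features at three resolutions:
feats = [torch.randn(1, 64, 56, 56), torch.randn(1, 128, 28, 28),
         torch.randn(1, 256, 14, 14)]
pyramid = ResFPNSketch()(feats)
```

The intuition matching the summary: dense matching needs well-localized features, and routing high-resolution information into the coarser pyramid levels as residuals preserves that localization.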
The Winter Conference on Applications of Computer Vision (WACV 2021) is IEEE’s and the PAMI-TC’s premier meeting on applications of computer vision. With its high quality and low cost, it provides exceptional value for students, academics and industry researchers. In 2021, the conference is organized as a virtual online event from January 5th to 9th, 2021.
The four accepted papers are:
1. SSGP: Sparse Spatial Guided Propagation for Robust and Generic Interpolation
René Schuster, Oliver Wasenmüller, Christian Unger, Didier Stricker
Q/A Session: Oral 1B, January 6th, 7 pm CET.
2. A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions
René Schuster, Christian Unger, Didier Stricker
Q/A Session: Oral 1C, January 6th, 7 pm CET.
3. SLAM in the Field: An Evaluation of Monocular Mapping and Localization on Challenging Dynamic Agricultural Environment
Fangwen Shu, Paul Lesur, Yaxu Xie, Alain Pagani, Didier Stricker
Abstract: This paper demonstrates a system capable of combining a sparse, indirect, monocular visual SLAM with both offline and real-time Multi-View Stereo (MVS) reconstruction algorithms. This combination overcomes many obstacles encountered by autonomous vehicles or robots employed in agricultural environments, such as overly repetitive patterns, the need for very detailed reconstructions, and abrupt movements caused by uneven roads. Furthermore, the use of monocular SLAM makes our system much easier to integrate with an existing device, as we do not rely on a LiDAR (which is expensive and power-consuming) or a stereo camera (whose calibration is sensitive to external perturbation, e.g. the camera being displaced). To the best of our knowledge, this paper presents the first evaluation results for monocular SLAM in this specific application scenario. Our work further explores unsupervised depth estimation by simulating RGB-D SLAM to tackle the scale ambiguity, and shows that our approach produces reconstructions that are helpful to various agricultural tasks. Moreover, we highlight that our experiments provide meaningful insight for improving monocular SLAM systems in agricultural settings.
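As a rough illustration of how a depth network can help resolve the scale ambiguity of monocular SLAM (this sketch is our own reading of the general idea, not necessarily the paper's exact procedure), a common approach is to align the up-to-scale SLAM depths of tracked points to network-predicted depths with a robust median ratio, then rescale the trajectory:

```python
# Sketch: recover metric scale for a monocular SLAM trajectory from a depth
# network's predictions. Function names and array layouts are illustrative.
import numpy as np

def estimate_scale(slam_depths: np.ndarray, predicted_depths: np.ndarray) -> float:
    """Both arrays hold per-point depths for the same image locations."""
    valid = (slam_depths > 0) & (predicted_depths > 0)
    return float(np.median(predicted_depths[valid] / slam_depths[valid]))

def rescale_trajectory(poses: np.ndarray, scale: float) -> np.ndarray:
    """poses: (N, 4, 4) camera-to-world matrices; scales the translations."""
    scaled = poses.copy()
    scaled[:, :3, 3] *= scale
    return scaled
```

A median ratio is preferred over a mean here because outlier depth predictions (e.g. on vegetation or sky) would otherwise dominate the estimate.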
4. Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function
Steve Dias Da Cruz, Bertram Taetz, Thomas Stifter, Didier Stricker
Abstract: Images recorded during the lifetime of computer vision-based systems undergo a wide range of illumination and environmental conditions affecting the reliability of previously trained machine learning models. Image normalization is hence a valuable preprocessing component to enhance the models’ robustness. To this end, we introduce a new strategy for the cost function formulation of encoder-decoder networks to average out all the unimportant information in the input images (e.g. environmental features and illumination changes) and to focus on the reconstruction of the salient features (e.g. class instances). Our method exploits the availability of identical sceneries under different illumination and environmental conditions, for which we formulate a partially impossible reconstruction target: the input image will not convey enough information to reconstruct the target in its entirety. Its applicability is assessed on three publicly available datasets. We combine the triplet loss as a regularizer in the latent space representation and a nearest neighbour search to improve the generalization to unseen illuminations and class instances. The importance of the aforementioned post-processing is highlighted in an automotive application. To this end, we release a synthetic dataset of sceneries from three different passenger compartments where each scenery is rendered under ten different illumination and environmental conditions: https://sviro.kl.dfki.de
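To make the “partially impossible” target concrete, here is a minimal PyTorch sketch under assumptions of our own (dataset layout, toy autoencoder): the input shows a scene under one illumination while the reconstruction target is the same scene under a different illumination, so the illumination itself can never be reconstructed and the network is pushed to average it out:

```python
# Sketch of the partially impossible reconstruction target: input and target
# are the SAME scene under DIFFERENT illuminations. Dataset layout and the
# stand-in autoencoder are illustrative assumptions.
import random
import torch
import torch.nn as nn

def training_batch(scenes):
    """scenes: list of tensors (K, C, H, W), K illuminations per scene."""
    inputs, targets = [], []
    for scene in scenes:
        i, j = random.sample(range(scene.size(0)), 2)  # two illuminations
        inputs.append(scene[i])
        targets.append(scene[j])  # target under a different illumination
    return torch.stack(inputs), torch.stack(targets)

autoencoder = nn.Sequential(  # stand-in encoder-decoder
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1),
)
x, y = training_batch([torch.rand(10, 3, 64, 64) for _ in range(4)])
loss = nn.functional.l1_loss(autoencoder(x), y)
```

Because the target's illumination is unpredictable from the input, this reconstruction loss cannot be driven to zero; its minimizer keeps the scene content and averages over illumination conditions, which is exactly the normalization effect the abstract describes.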
The International Journal of Computer Vision (IJCV) is considered one of the top journals in Computer Vision. It details the science and engineering of this rapidly growing field. Regular articles present major technical advances of broad general interest. Survey articles offer critical reviews of the state of the art and/or tutorial presentations of pertinent topics.
We are proud to announce that our paper “SceneFlowFields++: Multi-frame Matching, Visibility Prediction, and Robust Interpolation for Scene Flow Estimation” has been published in the IJCV (for more information click here). It is an extension of our earlier WACV paper “SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences”.