News

XR for nature and environment survey

On July 29th, 2021, Dr. Jason Rambach presented the survey paper “A Survey on Applications of Augmented, Mixed and Virtual Reality for Nature and Environment” at the 23rd Human Computer Interaction Conference, HCI International. The article is the result of a collaboration between DFKI, the Worms University of Applied Sciences and the University of Kaiserslautern.

Abstract: Augmented, virtual and mixed reality (AR/VR/MR) are technologies of great potential due to the engaging and enriching experiences they are capable of providing. However, the possibilities that AR/VR/MR offer in the area of environmental applications are not yet widely explored. In this paper, we present the outcome of a survey meant to discover and classify existing AR/VR/MR applications that can benefit the environment or increase awareness on environmental issues. We performed an exhaustive search over several online publication access platforms and past proceedings of major conferences in the fields of AR/VR/MR. Identified relevant papers were filtered based on novelty, technical soundness, impact and topic relevance, and classified into different categories. Referring to the selected papers, we discuss how the applications of each category are contributing to environmental protection and awareness. We further analyze these approaches as well as possible future directions in the scope of existing and upcoming AR/VR/MR enabling technologies.

Authors: Jason Rambach, Gergana Lilligreen, Alexander Schäfer, Ramya Bankanal, Alexander Wiebel, Didier Stricker

Paper: https://av.dfki.de/publications/a-survey-on-applications-of-augmented-mixed-and-virtual-reality-for-nature-and-environment/

Contact: Jason.Rambach@dfki.de

Press coverage of our project “KI-Rebschnitt”

We are very pleased about the press coverage of our project “KI-Rebschnitt”. Enjoy reading the articles:

https://www.rlp-international.de/blog/innovation-im-weinbau-durch-kuenstliche-intelligenz/

https://add.rlp.de/de/aktuelles/detail/news/News/detail/landwirtschaft-40-1145000-euro-foerderung-fuer-projekt-zur-kuenstlichen-intelligenz-beim-rebschnit/

https://www.dwm-aktuell.de/foerderbescheid-projekt-ki-rebschnitt

https://www.volksfreund.de/region/mosel-wittlich-hunsrueck/mit-der-datenbrille-in-den-wingert-wie-kuenstliche-intelligenz-den-winzern-helfen-kann_aid-59605207

https://www.ecobeach.de/rheinland-pfalz-oekolandbau-staerken-innovationen-foerdern/

Advancing sports analytics to coach athletes through Deep Learning research

Recent advancements in deep learning have led to interesting new applications such as analyzing human motion and activities in recorded videos. The analysis ranges from simple motions, such as walking or performing exercises, to complex motions such as playing sports.

An athlete’s performance can easily be captured with a fixed camera in sports like tennis, badminton and diving. The wide availability of low-cost cameras in handheld devices has made it commonplace to record videos and analyze an athlete’s performance. Although sports trainers can provide visual feedback by playing back recorded videos, it is still hard to measure and monitor an athlete’s improvement. In addition, manual analysis of the footage is time-consuming, as it involves isolating actions of interest and categorizing them using domain-specific knowledge. The automatic interpretation of performance parameters in sports has therefore attracted keen interest.

Competitive diving is a well-recognized Olympic aquatic sport in which an athlete dives from a platform or a springboard and performs different classes of acrobatics before entering the water. These classes are standardized by the international organization Fédération Internationale de Natation (FINA). The differences between the acrobatics performed in the various classes are very subtle. They arise within the short interval that starts with the diver standing on the platform or springboard and ends at the moment he or she enters the water. Modeling this is challenging because it involves rapid changes and requires an understanding of long-term human dynamics. Furthermore, the model must be sensitive to subtle changes in body pose over a large number of frames to determine the correct classification.

In order to automate this kind of task, three challenging sub-problems are often encountered: 1) temporally cropping events/actions of interest from continuous video; 2) tracking the person of interest even though other divers and bystanders may be in view; and 3) classifying the events/actions of interest.
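As a rough illustration of the first sub-problem, the following minimal sketch crops events from a continuous video, assuming some upstream model already provides one action score per frame. The function name `crop_events`, the threshold and the minimum length are hypothetical choices for this example and are not taken from the project.

```python
from typing import List, Tuple


def crop_events(scores: List[float], threshold: float = 0.5,
                min_length: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive, for runs of
    consecutive frames whose score is at or above the threshold."""
    events = []
    start = None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i                      # a run begins at this frame
        elif s < threshold and start is not None:
            if i - start >= min_length:    # keep only sufficiently long runs
                events.append((start, i))
            start = None
    # close a run that lasts until the final frame
    if start is not None and len(scores) - start >= min_length:
        events.append((start, len(scores)))
    return events


if __name__ == "__main__":
    per_frame = [0.1, 0.2, 0.8, 0.9, 0.7, 0.1, 0.9, 0.1, 0.6, 0.6, 0.6]
    print(crop_events(per_frame))
```

The minimum length filters out isolated high-scoring frames, which would otherwise be passed on to the tracking and classification stages as spurious events.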

We are developing a solution in cooperation with the Institut für Angewandte Trainingswissenschaft (IAT) in Leipzig to tackle the three sub-problems. We are working towards a complete parameter tracking solution based on monocular markerless human body motion tracking that uses only a mobile device (tablet or mobile phone) as a training support tool for the overall diving action analysis. The proposed techniques can be generalized to video footage recorded in other sports.

Contact persons: Dr. Bertram Taetz, Pramod Murthy

Three Papers Accepted at CAIP 2021

We are happy to announce that three papers on our structured light 3D reconstruction pipeline have been accepted for publication at CAIP 2021. The International Conference on Computer Analysis of Images and Patterns will take place from September 28th to 30th, 2021 as a virtual conference.

The three accepted papers are entitled “Fast Projector-Driven Structured Light Matching in Sub-Pixel Accuracy using Bilinear Interpolation Assumption”, “Simultaneous Bi-Directional Structured Light Encoding for Practical Uncalibrated Profilometry” and “Joint Global ICP for Improved Automatic Alignment of Full Turn Object Scans”, and will be available right after the conference.

Authors: Torben Fetzer, Gerd Reis and Didier Stricker

VIZTA Project 24M Review and public summary

DFKI participates in the VIZTA project, coordinated by STMicroelectronics, which aims to develop innovative technologies in the field of optical sensors and laser sources for short to long-range 3D imaging and to demonstrate their value in several key applications including automotive, security, smart buildings, mobile robotics for smart cities, and Industry 4.0. The 24-month review by the EU Commission was completed and a public summary of the project was released, including updates from DFKI Augmented Vision on time-of-flight camera dataset recording and deep learning algorithm development for car in-cabin monitoring and smart building person counting and anomaly detection applications.

Please click here to check out the complete summary: https://www.vizta-ecsel.eu/newsletter-april-2021/

Contact: Dr. Jason Rambach, Dr. Bruno Mirbach

Paper accepted at ICIP 2021

We are happy to announce that our paper “SEMANTIC SEGMENTATION IN DEPTH DATA: A COMPARATIVE EVALUATION OF IMAGE AND POINT CLOUD BASED METHODS” has been accepted for publication at ICIP 2021, the IEEE International Conference on Image Processing, which will take place from September 19th to 22nd, 2021 in Anchorage, Alaska, USA.

Abstract: The problem of semantic segmentation from depth images can be addressed by segmenting directly in the image domain or at 3D point cloud level. In this paper, we attempt for the first time to provide a study and experimental comparison of the two approaches. Through experiments on three datasets, namely SUN RGB-D, NYUdV2 and TICaM, we extensively compare various semantic segmentation algorithms, the input to which includes images and point clouds derived from them. Based on this, we offer analysis of the performance and computational cost of these algorithms that can provide guidelines on when each method should be preferred.
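To illustrate how a point cloud can be derived from a depth image, here is a minimal sketch of the standard pinhole back-projection. It assumes the depth values are measured along the optical axis and that the intrinsics `fx`, `fy`, `cx`, `cy` are known; the function name and the example values are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def depth_to_points(depth: List[List[float]], fx: float, fy: float,
                    cx: float, cy: float) -> List[Tuple[float, float, float]]:
    """Back-project every pixel with a positive depth into camera coordinates
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    points = []
    for v, row in enumerate(depth):        # v is the image row
        for u, z in enumerate(row):        # u is the image column
            if z <= 0.0:                   # non-positive depth marks an invalid pixel
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


if __name__ == "__main__":
    tiny_depth = [[0.0, 2.0],
                  [1.0, 0.0]]
    print(depth_to_points(tiny_depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5))
```

Sensors that report radial distance rather than depth along the optical axis would need an additional conversion step before this formula applies.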

Authors: Jigyasa Katrolia, Lars Krämer, Jason Rambach, Bruno Mirbach, Didier Stricker

Paper: https://av.dfki.de/publications/semantic-segmentation-in-depth-data-a-comparative-evaluation-ofimage-and-point-cloud-based-methods/

Contact: Jigyasa_Singh.Katrolia@dfki.de, Jason.Rambach@dfki.de

Paper Accepted at CVPR 2021 Conference!

We are proud that our paper “RPSRNet: End-to-End Trainable Rigid Point Set Registration Network using Barnes-Hut 2^D-Tree Representation” has been accepted for publication at the Conference on Computer Vision and Pattern Recognition (CVPR) 2021, which will take place online from June 19th to 25th. CVPR is the premier annual computer vision conference. Our paper was accepted from ~12000 submissions (acceptance rate: 23.4%).

Abstract: We propose RPSRNet – a novel end-to-end trainable deep neural network for rigid point set registration. For this task, we use a novel 2^D-tree representation for the input point sets and a hierarchical deep feature embedding in the neural network. An iterative transformation refinement module of our network boosts the feature matching accuracy in the intermediate stages. We achieve an inference speed of ~12-15 ms to register a pair of input point clouds as large as ~250K. Extensive evaluations on (i) KITTI LiDAR-odometry and (ii) ModelNet-40 datasets show that our method outperforms prior state-of-the-art methods – e.g., on the KITTI dataset, DCP-v2 by 1.3 and 1.5 times, and PointNetLK by 1.8 and 1.9 times better rotational and translational accuracy respectively. Evaluation on ModelNet40 shows that RPSRNet is more robust than other benchmark methods when the samples contain a significant amount of noise and disturbance. RPSRNet accurately registers point clouds with non-uniform sampling densities, e.g., LiDAR data, which cannot be processed by many existing deep-learning-based registration methods.

“Rigid Point Set Registration using Barnes-Hut (BH) 2^D-tree
Representation — The center-of-masses (CoMs) and point-densities of
non-empty tree-nodes are computed for the respective BH-trees of the
source and target. These two attributes are input to our RPSRNet which
predicts rigid transformation from the global feature-embedding of the
tree-nodes.”
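The two node attributes named in the caption can be sketched for a single subdivision level as follows. This is only an illustration under stated assumptions: it splits a cube into 2^3 = 8 cells once instead of building a full tree, it assumes all points lie inside the cube, and it defines "density" as the share of all points that fall into a cell, which may differ from the definition used in the paper. The function name `cell_attributes` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def cell_attributes(points: List[Point], lo: float, hi: float
                    ) -> Dict[tuple, Tuple[tuple, float]]:
    """Split the cube [lo, hi]^3 into 8 equal cells and return, for each
    non-empty cell, its centre of mass and its share of all points."""
    half = (hi - lo) / 2.0
    buckets = defaultdict(list)
    for p in points:
        # 0 or 1 per axis selects one of the eight cells; clamp the upper face
        key = tuple(min(int((c - lo) / half), 1) for c in p)
        buckets[key].append(p)
    result = {}
    for key, members in buckets.items():
        n = len(members)
        com = tuple(sum(p[axis] for p in members) / n for axis in range(3))
        result[key] = (com, n / len(points))
    return result


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (0.25, 0.25, 0.25), (0.75, 0.75, 0.75)]
    for cell, (com, density) in sorted(cell_attributes(cloud, 0.0, 1.0).items()):
        print(cell, com, density)
```

Applying the same split recursively to each non-empty cell would yield the hierarchical structure; empty cells never appear in the result, which keeps the representation compact for sparse data such as LiDAR scans.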

Authors: Sk Aziz Ali, Kerem Kahraman, Gerd Reis, Didier Stricker


Paper Accepted in IEEE Access Journal!

We are happy to announce that our paper “Fast Gravitational Approach for Rigid Point Set Registration With Ordinary Differential Equations” has been accepted for publication in the IEEE Access Journal (Impact Factor: 3.745).

Abstract: This article introduces a new physics-based method for rigid point set alignment called Fast Gravitational Approach (FGA). In FGA, the source and target point sets are interpreted as rigid particle swarms with masses interacting in a globally multiply-linked manner while moving in a simulated gravitational force field. The optimal alignment is obtained by explicit modeling of forces acting on the particles as well as their velocities and displacements with second-order ordinary differential equations of n-body motion. Additional alignment cues can be integrated into FGA through particle masses. We propose a smooth-particle mass function for point mass initialization, which improves robustness to noise and structural discontinuities. To avoid the quadratic complexity of all-to-all point interactions, we adapt a Barnes-Hut tree for accelerated force computation and achieve quasilinear complexity. We show that the new method class has characteristics not found in previous alignment methods such as efficient handling of partial overlaps, inhomogeneous sampling densities, and coping with large point clouds with reduced runtime compared to the state of the art. Experiments show that our method performs on par with or outperforms all compared competing deep-learning-based and general-purpose techniques (which do not take training data) in resolving transformations for LiDAR data and gains state-of-the-art accuracy and speed when coping with different data.

Authors: Sk Aziz Ali, Kerem Kahraman, Christian Theobalt, Didier Stricker, Vladislav Golyanik

Link to the paper: https://ieeexplore.ieee.org/document/9442679

DFKI-BMW joint research on Augmented Reality for automotive use cases

As part of a research cooperation, DFKI’s Augmented Vision Department and BMW are working jointly on Augmented Reality for in-car applications. Ahmet Firintepe, a BMW research PhD student supervised by Dr. Alain Pagani and Prof. Didier Stricker, has recently published two papers on outside-in head and glasses pose estimation:

Ahmet Firintepe, Alain Pagani and Didier Stricker:
“A Comparison of Single and Multi-View IR image-based AR Glasses Pose Estimation Approaches”
Proc. of the IEEE Virtual Reality conference – Posters. IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) (IEEEVR-2021)

In this paper, we present a study on single and multi-view image-based AR glasses pose estimation with two novel methods. The first approach is named GlassPose and is a VGG-based network. The second approach GlassPoseRN is based on ResNet18. We train and evaluate the two custom developed glasses pose estimation networks with one, two and three input images on the HMDPose dataset. We achieve errors as low as 0.10 degrees and 0.90 mm on average on all axes for orientation and translation. For both networks, we observe minimal improvements in position estimation with more input views.

Ahmet Firintepe, Carolin Vey, Stylianos Asteriadis, Alain Pagani, Didier Stricker:
“From IR Images to Point Clouds to Pose: Point Cloud-Based AR Glasses Pose Estimation”
In: Journal of Imaging 7, 80, pages 1-18, MDPI, 4/2021.

In this paper, we propose two novel AR glasses pose estimation algorithms from single infrared images by using 3D point clouds as an intermediate representation. Our first approach “PointsToRotation” is based on a Deep Neural Network alone, whereas our second approach “PointsToPose” is a hybrid model combining Deep Learning and a voting-based mechanism. Our methods utilize a point cloud estimator, which we trained on multi-view infrared images in a semisupervised manner, generating point clouds based on one image only. We generate a point cloud dataset with our point cloud estimator using the HMDPose dataset, consisting of multi-view infrared images of various AR glasses with the corresponding 6-DoF poses. In comparison to another point cloud-based 6-DoF pose estimation named CloudPose, we achieve an error reduction of around 50%. Compared to a state-of-the-art image-based method, we reduce the pose estimation error by around 96%.

Paper accepted at MDPI Electronics

Our paper “Controlling Teleportation-Based Locomotion in Virtual Reality with Hand Gestures: A Comparative Evaluation of Two-Handed and One-Handed Techniques” has been accepted by MDPI Electronics for the Special Issue on Recent Advances in Virtual Reality and Augmented Reality.

Paper: https://www.mdpi.com/2079-9292/10/6/715 (available as Open Access)
Authors: Alexander Schäfer, Gerd Reis, Didier Stricker

Abstract: Virtual Reality (VR) technology offers users the possibility to immerse and freely navigate through virtual worlds. An important component for achieving a high degree of immersion in VR is locomotion. Often discussed in the literature, a natural and effective way of controlling locomotion is still a general problem which needs to be solved. Recently, VR headset manufacturers have been integrating more sensors, allowing hand or eye tracking without any additional required equipment. This enables a wide range of application scenarios with natural freehand interaction techniques where no additional hardware is required. This paper focuses on techniques to control teleportation-based locomotion with hand gestures, where users are able to move around in VR using their hands only. With the help of a comprehensive study involving 21 participants, four different techniques are evaluated. The effectiveness and efficiency as well as user preferences of the presented techniques are determined. Two two-handed and two one-handed techniques are evaluated, revealing that it is possible to move comfortable and effectively through virtual worlds with a single hand only.

TICaM Dataset for In-Cabin Monitoring released

As part of the research activities of DFKI Augmented Vision in the VIZTA project (https://www.vizta-ecsel.eu/), we have published an open-source dataset for automotive in-cabin monitoring with a wide-angle time-of-flight depth sensor. The TICaM dataset represents a variety of in-car person behavior scenarios and is annotated with 2D/3D bounding boxes, segmentation masks and person activity labels. The dataset is available here: https://vizta-tof.kl.dfki.de/. The publication describing the dataset in detail is available as a preprint here: https://arxiv.org/pdf/2103.11719.pdf

Contacts: Jason Rambach, Jigyasa Katrolia

Paper accepted at ICRA 2021

We are delighted to announce that our paper “PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN” has been accepted for publication at ICRA 2021, the IEEE International Conference on Robotics and Automation, which will take place from May 30 to June 5, 2021 in Xi’an, China.

Abstract: Instance segmentation of planar regions in indoor scenes benefits visual SLAM and other applications such as augmented reality (AR) where scene understanding is required. Existing methods built upon two-stage frameworks show satisfactory accuracy but are limited by low frame rates. In this work, we propose a real-time deep neural architecture that estimates piece-wise planar regions from a single RGB image. Our model employs a variant of a fast single-stage CNN architecture to segment plane instances. Considering the particularity of the target detected, we propose Fast Feature Non-maximum Suppression (FF-NMS) to reduce the suppression errors resulted from overlapping bounding boxes of planes. We also utilize a Residual Feature Augmentation module in the Feature Pyramid Network (FPN). Our method achieves significantly higher frame-rates and comparable segmentation accuracy against two-stage methods. We automatically label over 70,000 images as ground truth from the Stanford 2D-3D-Semantics dataset. Moreover, we incorporate our method with a state-of-the-art planar SLAM and validate its benefits.
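For context on the suppression errors mentioned in the abstract, the sketch below shows the standard greedy non-maximum suppression on bounding boxes. This is the conventional baseline, not the FF-NMS proposed in the paper; the function names and the threshold value are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0.0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, visiting boxes from highest to lowest
    score and dropping any box that overlaps an already kept one too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

Because the decision depends only on box overlap, two distinct planes whose bounding boxes overlap heavily (for example a floor and an adjacent wall seen at an angle) can suppress each other, which is the kind of error the abstract refers to.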

Authors: Yaxu Xie, Jason Raphael Rambach, Fangwen Shu, Didier Stricker

Paper: https://av.dfki.de/publications/planesegnet-fast-and-robust-plane-estimation-using-a-single-stage-instance-segmentation-cnn/

Contact: Yaxu.Xie@dfki.de, Jason.Rambach@dfki.de

Two articles published in the IEEE Access journal

We are happy to announce that two of our papers have been accepted and published in the IEEE Access journal. IEEE Access is an award-winning, multidisciplinary, all-electronic archival journal, continuously presenting the results of original research or development across all of IEEE’s fields of interest. The articles are published with open access to all readers. The research is part of the BIONIC project and was funded by the European Commission under the Horizon 2020 Programme Grant Agreement n. 826304.

“Simultaneous End User Calibration of Multiple Magnetic Inertial Measurement Units With Associated Uncertainty”
Published in: IEEE Access (Volume: 9)
Page(s): 26468 – 26483
Date of Publication: 05 February 2021
Electronic ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3057579

“Magnetometer Robust Deep Human Pose Regression With Uncertainty Prediction Using Sparse Body Worn Magnetic Inertial Measurement Units”
Published in: IEEE Access (Volume: 9)
Page(s): 36657 – 36673
Date of Publication: 26 February 2021
Electronic ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3062545

Presentation on Machine Learning and Computer Vision by Dr. Jason Rambach

On March 4th, 2021, Dr. Jason Rambach gave a talk on Machine Learning and Computer Vision at the GIZ (Deutsche Gesellschaft für Internationale Zusammenarbeit) workshop on Machine Learning and Computer Vision for Earth Observation organized by the DFKI MLT department. In the talk, the foundations of Computer Vision, Machine Learning and Deep Learning as well as current Research and Implementation challenges were presented.

Presentation by our senior researcher Dr. Jason Rambach

Agenda of the GIZ workshop on Machine Learning and Computer Vision for Earth Observation

VIZTA project: 18-month public project summary released

DFKI participates in the VIZTA project, coordinated by STMicroelectronics, which aims to develop innovative technologies in the field of optical sensors and laser sources for short to long-range 3D imaging and to demonstrate their value in several key applications including automotive, security, smart buildings, mobile robotics for smart cities, and Industry 4.0. The 18-month public summary of the project was released, including updates from DFKI Augmented Vision on time-of-flight camera dataset recording and deep learning algorithm development for car in-cabin monitoring and smart building person counting and anomaly detection applications.


3 Papers accepted at VISAPP 2021

We are excited to announce that the Augmented Vision group will present 3 papers at the upcoming VISAPP 2021 Conference, February 8th-10th, 2021.

The International Conference on Computer Vision Theory and Applications (VISAPP) is part of VISIGRAPP, the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. VISAPP aims to become a major point of contact between researchers, engineers and practitioners in the area of computer vision application systems. Homepage: http://www.visapp.visigrapp.org/

The 3 accepted papers are:

1. An Adversarial Training based Framework for Depth Domain Adaptation
Jigyasa Singh Katrolia, Lars Krämer, Jason Raphael Rambach, Bruno Mirbach, Didier Stricker
One sentence summary: The paper presents a GAN-based method for domain adaptation between depth images.

2. OFFSED: Off-Road Semantic Segmentation Dataset
Peter Neigel, Jason Raphael Rambach, Didier Stricker
One sentence summary: A dataset for semantic segmentation in off-road scenes for automotive applications is made publicly available.

3. SALT: A Semi-automatic Labeling Tool for RGB-D Video Sequences
Dennis Stumpf, Stephan Krauß, Gerd Reis, Oliver Wasenmüller, Didier Stricker
One sentence summary: SALT proposes a simple and effective tool to facilitate the annotation process for segmentation and detection ground truth data in RGB-D video sequences.

Article in the MDPI Sensors journal

We are happy to announce that our paper “SynPo-Net–Accurate and Fast CNN-Based 6DoF Object Pose Estimation Using Synthetic Training” has been accepted for publication in the MDPI Sensors journal, Special Issue “Object Tracking and Motion Analysis”. Sensors (ISSN 1424-8220; CODEN: SENSC9) is the leading international peer-reviewed open access journal on the science and technology of sensors.

Abstract: Estimation and tracking of 6DoF poses of objects in images is a challenging problem of great importance for robotic interaction and augmented reality. Recent approaches applying deep neural networks for pose estimation have shown encouraging results. However, most of them rely on training with real images of objects with severe limitations concerning ground truth pose acquisition, full coverage of possible poses, and training dataset scaling and generalization capability. This paper presents a novel approach using a Convolutional Neural Network (CNN) trained exclusively on single-channel Synthetic images of objects to regress 6DoF object Poses directly (SynPo-Net). The proposed SynPo-Net is a network architecture specifically designed for pose regression and a proposed domain adaptation scheme transforming real and synthetic images into an intermediate domain that is better fit for establishing correspondences. The extensive evaluation shows that our approach significantly outperforms the state-of-the-art using synthetic training in terms of both accuracy and speed. Our system can be used to estimate the 6DoF pose from a single frame, or be integrated into a tracking system to provide the initial pose.

Authors: Yongzhi Su, Jason Raphael Rambach, Alain Pagani, Didier Stricker

Article: https://av.dfki.de/publications/synpo-net-accurate-and-fast-cnn-based-6dof-object-pose-estimation-using-synthetic-training/

Contact: Yongzhi.Su@dfki.de, Jason.Rambach@dfki.de

Final virtual training workshop for the Erasmus+ project ARinfuse: Exploiting the potential of Augmented Reality & Geospatial Technologies within the utilities sector

After two years of collaborative work, the ARinfuse project invites you to its final workshop on January 28th.

ARinfuse is an Erasmus+ project that aims to infuse skills in Augmented Reality for geospatial information management in the context of utility underground infrastructures, such as water, sewage, electricity, gas and fiber optics. In this field, there is a real need for an accurate positioning of the underground utilities, to avoid damages to the existing infrastructures. Information communication technologies (ICT), in fusion with global navigation satellite systems (GNSS), GIS and geodatabases and augmented/virtual reality (AR/VR) are able to offer the possibility to convert the geospatial information of the underground utilities into a powerful tool for field workers, engineers and managers.
ARinfuse is mainly addressed to current and future technical professionals in the utility sector who use, or are planning to use, AR technology in practical applications for the ordinary management and maintenance of utility networks.

The workshop entitled “Exploiting the potential of Augmented Reality & Geospatial Technologies within the utilities sector” is addressed to engineering students and professionals who are interested in the function, application and benefits of AR and geospatial technologies in the utilities sector.

The workshop will also introduce the ARinfuse catalogue of training modules on Augmented Reality and Geoinformatics applied within the utility infrastructure sector.

Registration: https://www.arinfuse.eu/arinfuse-online-workshop-register/
More information: https://www.arinfuse.eu/join-the-final-arinfuse-online-event-training-seminar-thursday-28-01-2021/

Contact persons: Dr. Alain Pagani and Narek Minaskan

Three papers accepted at ICPR 2020

We are proud to announce that the Augmented Vision group will present three papers in the upcoming ICPR 2020 conference which will take place from January 10th till 15th, 2021. The International Conference on Pattern Recognition (ICPR) is the premier world conference in Pattern Recognition. It covers both theoretical issues and applications of the discipline. The 25th event in this series is organized as an online virtual conference with more than 1800 participants expected.

The three accepted papers are:

1. HPERL: 3D Human Pose Estimation from RGB and LiDAR
David Michael Fürst, Shriya T. P. Gupta, René Schuster, Oliver Wasenmüller, Didier Stricker
One sentence summary: HPERL proposes a two-stage 3D human pose detector that fuses RGB and LiDAR information for a precise localization in 3D.
Presentation date: PS T3.3, January 12th, 5 pm CET.

2. ResFPN: Residual Skip Connections in Multi-Resolution Feature Pyramid Networks for Accurate Dense Pixel Matching
Rishav, René Schuster, Ramy Battrawy, Oliver Wasenmüller, Didier Stricker
One sentence summary: ResFPN extends Feature Pyramid Networks by adding residual connections from higher resolution features maps to obtain stronger and better localized features for dense matching with deep neural networks.
This paper is accepted as an oral presentation (best 6% of all submissions).
Presentation date: OS T5.1, January 12th, 2 pm CET; PS T5.1, January 12th, 5 pm CET.

3. Ghost Target Detection in 3D Radar Data using Point Cloud based Deep Neural Network
Mahdi Chamseddine, Jason Rambach, Oliver Wasenmüller, Didier Stricker
One sentence summary: An extension to PointNet is developed and trained to detect ghost targets in 3D radar point clouds using labels by an automatic labelling algorithm.
Presentation date: PS T1.16, January 15th, 4:30 pm CET.

Four papers accepted at WACV 2021

The Winter Conference on Applications of Computer Vision (WACV 2021) is IEEE’s and the PAMI-TC’s premier meeting on applications of computer vision. With its high quality and low cost, it provides an exceptional value for students, academics and industry researchers. In 2021, the conference is organized as a virtual online event from January 5th till 9th, 2021.

The four accepted papers are:

1. SSGP: Sparse Spatial Guided Propagation for Robust and Generic Interpolation
René Schuster, Oliver Wasenmüller, Christian Unger, Didier Stricker
Q/A Session: Oral 1B, January 6th, 7 pm CET.

2. A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions
René Schuster, Christian Unger, Didier Stricker
Q/A Session: Oral 1C, January 6th, 7 pm CET.

3. SLAM in the Field: An Evaluation of Monocular Mapping and Localization on Challenging Dynamic Agricultural Environment
Fangwen Shu, Paul Lesur, Yaxu Xie, Alain Pagani, Didier Stricker

Abstract: This paper demonstrates a system capable of combining a sparse, indirect, monocular visual SLAM, with both offline and real-time Multi-View Stereo (MVS) reconstruction algorithms. This combination overcomes many obstacles encountered by autonomous vehicles or robots employed in agricultural environments, such as overly repetitive patterns, need for very detailed reconstructions, and abrupt movements caused by uneven roads. Furthermore, the use of a monocular SLAM makes our system much easier to integrate with an existing device, as we do not rely on a LiDAR (which is expensive and power consuming), or stereo camera (whose calibration is sensitive to external perturbation e.g. camera being displaced). To the best of our knowledge, this paper presents the first evaluation results for monocular SLAM, and our work further explores unsupervised depth estimation on this specific application scenario by simulating RGB-D SLAM to tackle the scale ambiguity, and shows our approach produces reconstructions that are helpful to various agricultural tasks. Moreover, we highlight that our experiments provide meaningful insight to improve monocular SLAM systems under agricultural settings.

4. Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function
Steve Dias Da Cruz, Bertram Taetz, Thomas Stifter, Didier Stricker

Abstract: Images recorded during the lifetime of computer vision based systems undergo a wide range of illumination and environmental conditions affecting the reliability of previously trained machine learning models. Image normalization is hence a valuable preprocessing component to enhance the models’ robustness. To this end, we introduce a new strategy for the cost function formulation of encoder-decoder networks to average out all the unimportant information in the input images (e.g. environmental features and illumination changes) to focus on the reconstruction of the salient features (e.g. class instances). Our method exploits the availability of identical sceneries under different illumination and environmental conditions for which we formulate a partially impossible reconstruction target: the input image will not convey enough information to reconstruct the target in its entirety. Its applicability is assessed on three publicly available datasets. We combine the triplet loss as a regularizer in the latent space representation and a nearest neighbour search to improve the generalization to unseen illuminations and class instances. The importance of the aforementioned post-processing is highlighted on an automotive application. To this end, we release a synthetic dataset of sceneries from three different passenger compartments where each scenery is rendered under ten different illumination and environmental conditions: https://sviro.kl.dfki.de
