Paper accepted at the ICRA conference

We are happy to announce that our paper titled

“Structure PLP-SLAM: Efficient Sparse Mapping and Localization using Point, Line and Plane for Monocular, RGB-D and Stereo Cameras”
Fangwen Shu, Jiaxuan Wang, Alain Pagani, Didier Stricker

has been accepted at the IEEE International Conference on Robotics and Automation (ICRA) 2023.

In this paper, we present a visual SLAM system that uses both points and lines for robust camera localization, and simultaneously performs a piece-wise planar reconstruction (PPR) of the environment to provide a structural map in real time. Our proposed SLAM system tightly incorporates semantic and geometric features to improve both frontend pose tracking and backend map optimization.

The ICRA conference takes place this year in London, from May 29th to June 2nd.

Contact: Dr. Alain Pagani

Workshop on AI and Robotics in Construction at ERF 2023

Dr. Jason Rambach, coordinator of the EU Horizon project HumanTech, co-organized a workshop on “AI and Robotics in Construction” at the European Robotics Forum 2023 in Odense, Denmark (March 14th to 16th, 2023), in cooperation with the construction robotics projects Beeyonders and RobetArme.

Representing HumanTech, Dr. Jason Rambach presented an overview of the project objectives as well as insights into the results achieved by Month 9 of the project. Patrick Roth from the partner Implenia presented the construction industry's perspective on, and challenges with, the use of robotics and AI on construction sites, while the project partners Dr. Bharath Sankaran (Naska.AI) and Dr. Gabor Sziebig (SINTEF) participated in a panel session discussing the future of robotics in construction.

Contact: Dr. Jason Rambach

Dr. Jason Rambach giving his presentation.
Start of the EU project ExtremeXP

The kick-off meeting for the EU project ExtremeXP was held on January 26th and 27th, 2023, in the city of Athens, Greece.

The vision of ExtremeXP “EXPerimentation driven and user eXPerience-oriented analytics for eXtremely Precise outcomes and decisions” is to provide accurate, precise, fit-for-purpose, and trustworthy data-driven insights via evaluating different complex analytics variants, considering end users’ preferences and feedback in an automated way.

Dr. Alain Pagani presented the capabilities of the Augmented Vision department, highlighting the expertise and key strengths it brings to the project in the areas of Augmented Reality and explainability.

ExtremeXP will provide:

  • Specification and semantics for modelling complex user-experience-driven analytics.
  • Automated and scalable data management for complex analytics workflow.
  • Scenario-driven and opportunistic machine learning to design and develop AutoML mechanisms for performing scenario-based algorithm and model selection considering on-demand user-provided constraints (performance, resources, time, model options).
  • User-experience- and experiment-driven optimization of complex analytics to design the architecture of the framework for experiment-driven optimisation of complex analytics.
  • Transparent decision making with interactive visualisation methods to explore how augmented reality, visual analytics, and other visualisation techniques and their combinations can enhance user experience for different use cases, actors, domains, applications, and problem areas.
  • Extreme data access control and knowledge management.
  • Test and validation framework and application on different impactful real-life use cases to incorporate the ExtremeXP tools, methods, models, and software into a scalable, usable, and interoperable integrated framework for complex experiment-driven analytics.

The project consortium consists of 20 partners, which are:

  1. Athena Research Center (coordinator) [Greece]
  2. Activeeon [France]
  3. Airbus Defence and Space SLC [France]
  4. BitSparkles [France]
  5. Bournemouth University [United Kingdom]
  6. CS-Group [France]
  7. Charles University of Prague [Czech Republic]
  8. Deutsches Forschungszentrum für Künstliche Intelligenz [Germany]
  9. Fundacio Privada I2cat, Internet I Innovacio Digital A Catalunya [Spain]
  10. Institute of Communications and Computer Systems [Greece]
  11. IDEKO [Spain]
  12. INTERACTIVE4D [France]
  14. IThinkUPC [Spain]
  15. MOBY X [Cyprus]
  16. SINTEF [Norway]
  17. Technical University of Delft [Netherlands]
  18. University of Ljubljana [Slovenia]
  19. Universitat Politècnica De Catalunya [Spain]
  20. Vrije Universiteit Amsterdam [Netherlands]

Contact persons:

Mohamed Selim

Dr. Alain Pagani

Nareg Minaskan Karabid

Article in IEEE Robotics and Automation Letter (RA-L) journal

We are happy to announce that our article “OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection” was published in the prestigious IEEE Robotics and Automation Letters (RA-L) journal. The work is a collaboration of DFKI with TU Munich and Google. The article is openly accessible at:

Abstract: Monocular 3D object detection has recently made a significant leap forward thanks to the use of pre-trained depth estimators for pseudo-LiDAR recovery. Yet, such two-stage methods typically suffer from overfitting and are incapable of explicitly encapsulating the geometric relation between depth and object bounding box. To overcome this limitation, we instead propose to jointly estimate dense scene depth with depth-bounding box residuals and object bounding boxes, allowing a two-stream detection of 3D objects that harnesses both geometry and context information. Thereby, the geometry stream combines visible depth and depth-bounding box residuals to recover the object bounding box via explicit occlusion-aware optimization. In addition, a bounding box based geometry projection scheme is employed in an effort to enhance distance perception. The second stream, named the Context Stream, directly regresses 3D object location and size. This novel two-stream representation enables us to enforce cross-stream consistency terms, which align the outputs of both streams and further improve the overall performance. Extensive experiments on the public benchmark demonstrate that OPA-3D outperforms state-of-the-art methods on the main Car category, whilst keeping a real-time inference speed.
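To make the cross-stream idea concrete, the toy sketch below computes an L1 consistency term between the 3D box parameters predicted by a geometry stream and a context stream. This is our own simplified illustration of the concept, not code from the paper; the array layout and function name are assumptions:

```python
import numpy as np

def cross_stream_consistency(geom_boxes, ctx_boxes):
    """L1 consistency between the 3D boxes predicted by the geometry
    stream and the context stream.

    Both inputs are (N, 7) arrays of box parameters, here assumed to be
    (x, y, z, w, h, l, yaw). The term is zero only when the two streams
    agree, so minimizing it pushes their outputs to align."""
    geom_boxes = np.asarray(geom_boxes, dtype=float)
    ctx_boxes = np.asarray(ctx_boxes, dtype=float)
    return np.abs(geom_boxes - ctx_boxes).mean()
```

In training, such a term would be added to the per-stream detection losses so that each stream regularizes the other.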

Yongzhi Su, Yan Di, Guangyao Zhai, Fabian Manhardt, Jason Rambach, Benjamin Busam, Didier Stricker and Federico Tombari. “OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection.” IEEE Robotics and Automation Letters (2023).

Contacts: Yongzhi Su, Dr. Jason Rambach

Radar Driving Activity Dataset (RaDA) Released

DFKI Augmented Vision recently released the first publicly available UWB Radar Driving Activity Dataset (RaDA), consisting of over 10k data samples from 10 different participants annotated with 6 driving activities. The dataset was recorded in the DFKI driving simulator environment. For more information and to download the dataset please check the project website:

The dataset release is accompanied by an article publication at the Sensors journal:

Brishtel, Iuliia, Stephan Krauss, Mahdi Chamseddine, Jason Raphael Rambach, and Didier Stricker. “Driving Activity Recognition Using UWB Radar and Deep Neural Networks.” Sensors 23, no. 2 (2023): 818.

Contacts: Dr. Jason Rambach, Iuliia Brishtel

Two new PhDs

On Thursday, October 27th, 2022, Mohamed Selim successfully defended his PhD thesis entitled “Deep Learning-based Head Orientation and Gender Estimation from Face Image” in front of the PhD committee consisting of Prof. Dr. Didier Stricker (TU Kaiserslautern), Prof. Dr. Karsten Berns (TU Kaiserslautern), and Prof. Dr. Stefan Deßloch (TU Kaiserslautern).

In the thesis, Mohamed Selim studied the problem of gender and head orientation estimation from face images. Machine-based perception can extract this underlying information from face images if the problem is properly modeled. The thesis provides novel solutions to the problems of head orientation estimation and gender prediction, and investigates how facial appearance changes caused by head orientation variation affect gender prediction accuracy. A novel orientation-guided feature map recalibration method is presented that significantly increases the accuracy of gender prediction.

Mohamed Selim received his bachelor's and master's degrees in Computer Science and Engineering from the German University in Cairo, Egypt. He joined the Augmented Vision department in October 2012 as a PhD candidate, and later, in March 2018, as a researcher working on industrial and EU research projects. His research interests include computer vision, 3D reconstruction, and deep learning.

Mr. Selim after his successful PhD defense

A week later, on Friday, November 4th, 2022, MSc. Ing. Hammad Tanveer Butt also successfully defended his PhD thesis entitled “Improved Sensor Fusion and Deep Learning of 3D Human Pose From Sparse Magnetic Inertial Measurement Units” in front of the PhD committee consisting of Prof. Dr. Didier Stricker (TU Kaiserslautern and DFKI), Prof. Dr. Imran Shafi (National University of Sciences and Technology, Pakistan) and Prof. Dr. Jörg Dörr (TU Kaiserslautern and IESE Fraunhofer).

The goal of the thesis was to obtain magnetometer-robust 3D human body pose from sparse magnetic inertial motion sensors, with uncertainty prediction using Bayesian deep learning. To this end, a systematic approach was adopted to address the challenges of inertial motion capture in an end-to-end manner. First, simultaneous calibration of multiple magnetic inertial sensors was achieved with error mitigation and residual uncertainty learning. Then a magnetometer-robust sensor fusion algorithm for 3D orientation was proposed. Adaptive anatomical error correction was used to reduce long-term drift in the joint angles.

Joint angle constraints were also learned using a data-driven approach, employing a swing-twist formulation for 3D joint rotations. Finally, the thesis showed that a Bayesian deep learning framework can learn 3D human pose from sparse magnetic inertial sensors while also predicting the uncertainty of the pose estimate, which correlates well with the actual error and the lack of information, particularly when the magnetometer-derived yaw angle is not used. The thesis led to two peer-reviewed contributions in the IEEE Access journal, as well as a best scientific paper award at the IntelliSys 2019 conference held in the UK. The conference paper on swing-twist learning of joint constraints, presented at Machine Vision Applications (MVA) 2019 in Tokyo, Japan, was later selected by the reviewing committee among the top candidates to be published as an extended journal paper. A conference paper and a poster by the author were also accepted at the FUSION 2019 conference held in Ottawa, Canada.
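For readers unfamiliar with the swing-twist formulation mentioned above: it splits a 3D rotation into a twist about a fixed anatomical axis and a residual swing perpendicular to it, which makes per-axis joint limits easy to express. Below is a minimal NumPy sketch of the standard decomposition; it is a generic illustration, not code from the thesis, and the function names are ours:

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def swing_twist(q, axis):
    """Decompose a unit quaternion q = (w, x, y, z) as q = swing * twist,
    where twist rotates about `axis` and swing about a perpendicular axis."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    w, v = q[0], np.asarray(q[1:], dtype=float)
    # Keep only the component of the rotation axis along the twist axis.
    proj = np.dot(v, axis) * axis
    twist = np.array([w, *proj])
    norm = np.linalg.norm(twist)
    if norm < 1e-9:  # 180-degree swing: the twist is the identity
        return np.asarray(q, dtype=float), np.array([1.0, 0.0, 0.0, 0.0])
    twist = twist / norm
    # swing = q * conj(twist); for unit quaternions conj == inverse.
    swing = quat_mul(q, np.array([twist[0], *(-twist[1:])]))
    return swing, twist
```

With the twist isolated per joint axis, a limit such as "elbow twist within ±x degrees" becomes a simple bound on one angle rather than a constraint on the full rotation.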

MSc. Ing. Hammad Tanveer Butt received his bachelor's degree in Avionics (1999) and his master's degree in Electrical Engineering (2013) from the National University of Sciences and Technology (NUST), Pakistan. From 2016 to 2021, he worked at the Augmented Vision (AV) group of DFKI as a researcher while pursuing his PhD. His research interests include nano-electronics, MEMS sensors, deep learning/AI, and quantum machine learning.

Start of the CORTEX² project

The kick-off meeting of the CORTEX² project has been held at DFKI in Kaiserslautern on September 20th, 2022.

Participants at the kick-off meeting in Kaiserslautern

The mission of CORTEX² “COoperative Real-Time EXperiences with EXtended reality” is to democratize access to the remote collaboration offered by next-generation XR experiences across a wide range of industries and SMEs.

CORTEX² will provide:

  • Full support for AR experience as an extension of video conferencing systems when using heterogeneous service end devices through a novel Mediation Gateway platform.
  • Resource-efficient teleconferencing tools through innovative transmission methods and automatic summarization of shared long documents.
  • Easy-to-use and powerful XR experiences with instant 3D reconstruction of environments and objects, and simplified use of natural gestures in collaborative meetings.
  • Fusion of vision and audio for multichannel semantic interpretation and enhanced tools such as virtual conversational agents and automatic meeting summarization.
  • Full integration of internet of things (IoT) devices into XR experiences to optimize interaction with running systems and processes.
  • Optimal extension possibilities and broad adoption by delivering the core system with open APIs and launching open calls to enable further technical extensions, more comprehensive use cases, and deeper evaluation and assessment.

Partners of the project are:

  • DFKI – Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (Germany)
  • LINAGORA (France)
  • ALE – Alcatel-Lucent Entreprise International (France)
  • ICOM – Intracom SA Telecom Solutions (Greece)
  • AUS – AUSTRALO Alpha Lab MTÜ (Estonia)
  • F6S – F6S Network Limited (Ireland)
  • KUL – Katholieke Universiteit Leuven (Belgium)
  • CEA – Commissariat à l’énergie atomique et aux énergies alternatives (France)
  • ACT – Actimage GmbH (Germany)
  • UJI – Universitat Jaume I De Castellon (Spain)

In addition to the project activities, CORTEX² will invest a total of 4 million Euros in two open calls, which will be aimed at recruiting tech startups/SMEs to co-develop CORTEX2; engaging new use-cases from different domains to demonstrate CORTEX2 replication through specific integration paths; assessing and validating the social impact associated with XR technology adoption in internal and external use cases.

Contact: Dr. Alain Pagani (Coordinator)

HAIKU project takes off!!

The European HAIKU project is taking off! The kick-off meeting took place in Lisbon on September 7th, 2022.

The goal of HAIKU is to develop a human-centric AI by exploring interactive AI prototypes in a variety of aviation contexts. A key challenge HAIKU faces is to develop human-centric digital assistants that will fit the way humans work.

It is essential, both for safe operations and for society in general, that the people who currently keep aviation so safe can work with, train and supervise these AI systems, and that future autonomous AI systems make judgements and decisions that would be acceptable to humans. HAIKU will pave the way for human-centric AI by developing new AI-based ‘Digital Assistants’, and associated Human-AI Teaming practices, guidance and assurance processes, via the exploration of interactive AI prototypes in a wide range of aviation contexts.

Therefore, HAIKU will:

  • Design and develop a set of AI assistants, demonstrated in the different use cases.
  • Develop a comprehensive Human Factors design guidance and methods capability (‘HF4AI’) on how to develop safe, effective and trustworthy Digital Assistants for Aviation, integrating and expanding on existing state-of-the-art guidance.
  • Conduct controlled experiments with high operational relevance – illustrating the tasks, roles, autonomy and team performance of the Digital Assistant in a range of normal and emergency scenarios.
  • Develop new safety and validation assurance methods for Digital Assistants, to facilitate early integration into aviation systems by aviation stakeholders and regulatory authorities.
  • Deliver guidance on socially acceptable AI in safety critical operations, and for maintaining aviation’s strong safety record.

DFKI participates with two departments: Augmented Vision and Cognitive Assistants.

Contact: Dr. Alain Pagani, Narek Minaskan

VIZTA Project successfully concluded after 42 months

The Augmented Vision department of DFKI participated in the VIZTA project, coordinated by ST Microelectronics, which aimed at developing innovative technologies in the field of optical sensors and laser sources for short- to long-range 3D imaging, and at demonstrating their value in several key applications, including automotive, security, smart buildings, mobile robotics for smart cities, and Industry 4.0.

The final project review was successfully completed in Grenoble, France on November 17th-18th, 2022. The schedule included presentations on the achievements of all partners as well as live demonstrations of the developed technologies. DFKI presented its smart-building person detection demonstrator, based on a top-down view from a Time-of-Flight (ToF) camera and developed in cooperation with the project partner IEE. A second demonstrator, an in-cabin monitoring system based on a wide-field-of-view camera installed in DFKI's lab, was presented in a video.

During VIZTA, several key results were obtained at DFKI on the topics of in-car and smart-building monitoring, including the in-car person and object detection and the top-down person detection and tracking shown in Figure 1.

Figure 1: In-car person and object detection (left), and top-down person detection and tracking for smart building applications (right).

Contact: Dr. Jason Rambach, Dr. Bruno Mirbach

DFKI Augmented Vision Researchers win two awards in Object Pose Estimation challenge (BOP Challenge, ECCV 2022)

DFKI Augmented Vision researchers Yongzhi Su, Praveen Nathan and Jason Rambach received 1st-place awards in the prestigious BOP Challenge 2022 in the categories “Overall Best Segmentation Method” and “Best BlenderProc-Trained Segmentation Method”.

The BOP benchmark and challenge addresses the problem of 6-degree-of-freedom object pose estimation, which is of great importance for many applications such as robot grasping or augmented reality. This year, the BOP challenge was held within the “Recovering 6D Object Pose” workshop at the European Conference on Computer Vision (ECCV) in Tel Aviv, Israel. A total award of $4000, donated by Meta Reality Labs and Niantic, was distributed among the winning teams of the BOP challenge.

The awards were received by Dr. Jason Rambach on behalf of the DFKI team, followed by a short presentation of the method. The winning method is based on the CVPR 2022 paper “ZebraPose”:

ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation
Yongzhi Su, Mahdi Saleh, Torben Fetzer, Jason Raphael Rambach, Nassir Navab, Benjamin Busam, Didier Stricker, Federico Tombari

The winning approach was developed by a team led by DFKI AV, with contributing researchers from TU Munich and Zhejiang University.

Contact: Yongzhi Su, Dr. Jason Rambach

Dr. Jason Rambach with the award