The Augmented Vision Department of DFKI, led by Prof. Dr. Didier Stricker, is offering a Master's thesis topic for curious and passionate students who want to develop their skills in advanced Deep Learning and 3D Computer Vision. (More guidance and specifics about the thesis will be discussed in the initial interview.)

Main Tasks:

  • In-depth literature review of recent 6DoF Object Pose Estimation approaches and review of various datasets [2] used in evaluating these methods
  • Developing a novel approach to Object Pose Estimation [1] using recent Deep Learning models such as Transformers [3]
  • Extensive experimentation and ablation studies on various Object Pose Datasets and comparison to state-of-the-art approaches
  • Potentially submitting a concise paper to a Computer Vision conference

Requirements

  • Good knowledge of concepts in 3D Computer Vision and Deep Learning
  • Strong experience in Python and PyTorch/TensorFlow
  • Previous experience in image rendering / dataset creation software like Blender is beneficial

References (Please go through papers [1] and [2] before the interview)

  1. Chen, K., Dou, Q.: SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation. ICCV 2021
  2. Sundermeyer, M., Hodan, T., Labbé, Y., Wang, G., Brachmann, E., Drost, B., Rother, C., Matas, J.E.: BOP Challenge 2022 on Detection, Segmentation and Pose Estimation of Specific Rigid Objects. CVPRW 2023
  3. Vaswani, A., Shazeer, N.M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention Is All You Need. NeurIPS 2017

 

Apply latest by: 31.08.2024

Contact:

Sandeep Inuganti, PhD Student

Augmented Vision, DFKI

Trippstadter Straße 122, 67663 Kaiserslautern

Email: sain01@dfki.de

Researcher(s) (m/f/d) in the field of Computer Vision and Large Language Models

The German Research Center for Artificial Intelligence (DFKI) is one of the world’s largest research institutes for software technology based on artificial intelligence (AI) methods. The Augmented Vision research department in Kaiserslautern, headed by Prof. Dr. Didier Stricker, works on computer vision, image processing, image understanding, augmented reality and 3D reconstruction from camera images, using approaches such as deep learning, among others.

Your tasks

Develop and implement advanced algorithms for computer vision and language models, including object detection, image segmentation, and natural language understanding, also in zero-shot settings.

Collect, preprocess, and annotate large datasets for training and validating both computer vision and language models.

Train deep learning models using frameworks such as TensorFlow or PyTorch, optimizing for performance, accuracy, and efficiency.

Evaluate model performance using various metrics like precision, recall, BLEU, and perplexity; conduct rigorous testing to ensure robustness and reliability.

Stay updated with the latest research papers and technological advancements in both computer vision and natural language processing, proposing innovative solutions to complex problems.

Develop prototypes and implement solutions in real-world applications, ensuring practical applicability and scalability.

Create and maintain software tools and libraries that facilitate research and development in computer vision and language models.

Your qualifications

Master’s degree in computer science or a related field.

Sound knowledge of machine learning, especially deep learning.

Ideally, experience with LLMs for computer vision.

Good programming skills in at least one programming language.

Creativity, independence, team spirit, and a proactive attitude.

The German Research Center for Artificial Intelligence (DFKI) has operated as a non-profit, Public-Private-Partnership (PPP) since 1988. DFKI combines scientific excellence and commercially-oriented value creation with social awareness and is recognized as a major “Center of Excellence” by the international scientific community. In the field of artificial intelligence, DFKI has focused on the goal of human-centric AI for more than 35 years. Research is committed to essential, future-oriented areas of application and socially relevant topics.

DFKI encourages applications from people with disability; DFKI intends to increase the proportion of female employees in the field of science and encourages women to apply for this position.


Apply here.

Creation of Synthetic Datasets using the Unreal Engine

The Augmented Vision (AV) department at DFKI Kaiserslautern is looking
for a research assistant (Hiwi) for creating a synthetic dataset of
spherical stereo images using the Unreal Engine for the task of human
body pose estimation.

Your Task

  • Searching and selecting maps, models and animations
  • Placing avatars in maps and applying animations to them
  • Placing virtual cameras and recording video sequences
  • Recording data on the positions of the cameras and body joints
  • Writing new and modifying existing code for character placement, animation selection and recording
  • Helping to record a small real-world dataset

What We Expect

  • Experience with the Unreal Engine
  • Good programming skills in Python and C++
  • Reliability and a conscientious way of working
  • Good communication skills in English or German

Please send your application (including CV and transcripts) in
electronic form to: Stephan.Krauss@dfki.de

The Augmented Vision Department of DFKI, led by Prof. Dr. Didier Stricker, offers a part-time student assistant job for curious and passionate students who want to develop their skills in advanced Computer Vision.

Your Task

  • Researching and developing techniques for hand pose estimation.
  • Researching and developing techniques for hand-object interaction.
  • Implementing state-of-the-art methods to solve real-world problems such as gesture recognition and hand mesh reconstruction for AR/VR use-cases.

Your Qualifications

  • Good knowledge of Python and PyTorch
  • Interest in Deep Learning and Computer Vision
  • Master's student or Bachelor's student in an advanced semester

Your Benefits

  • Acquire skills in the domains of hand pose estimation, hand mesh reconstruction and hand-object interaction.
  • Opportunity to produce novel research work in the domain of hand-object interaction.
  • Opportunity to start your thesis with us.
  • Practical experience in modern Deep Learning techniques.

Application Process:

  • Solve the tasks specified in the doc and email me the solution.
  • An interview will be scheduled to discuss the solution.
  • If everything goes well, the Hiwi contract usually starts one month after the job offer is confirmed.

Apply latest by: 15.09.2024

Please feel free to contact us if you have any questions regarding this position:

Christen.Millerdurai@dfki.de

Room 1.21, DFKI-Kaiserslautern

GANs have made significant strides since their inception in 2014, demonstrating remarkable capabilities in generating realistic audio and video mixes, as well as complex geometries. However, a persistent challenge lies in ensuring the accuracy of the generated results. While the visual appeal of these results may be convincing, they often fail to faithfully represent the genuine geometric properties they aim to emulate. A pertinent example is the simulation of paper folding and crumpling, where maintaining the inherent geometric characteristics of the sheet (e.g., preventing stretching) is crucial. Although deterministic simulations of paper deformation have been developed, comprehending and replicating them often necessitate an extensive understanding of material physics, mathematics, and computer graphics. One potential approach to address this challenge involves harnessing the power of GANs or other network architectures, such as variational autoencoders, to analyze 3D geometries and generate highly accurate representations. However, several hurdles must be overcome, including operating effectively within the 3D space and establishing a robust methodology to evaluate the fidelity of the output geometries. An intriguing application of such methods for generating 3D geometries lies in their ability to rapidly generate synthetic data with exceptional accuracy. This synthesized data can then be employed for training purposes in various domains, offering a valuable resource for enhancing learning algorithms and expanding their applicability.
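One simple way to quantify the "no stretching" property mentioned above is to compare edge lengths of a sheet mesh before and after deformation. The following is a minimal sketch under stated assumptions, not the evaluation methodology of this thesis: the function names (`grid_sheet`, `fold`, `max_relative_stretch`), the grid mesh and the choice of the maximum relative edge-length change as the metric are all hypothetical and chosen only for illustration.

```python
import math


def grid_sheet(n):
    """Flat n x n vertex grid on the unit square (hypothetical test mesh).

    Returns (vertices, edges) with horizontal, vertical and diagonal edges.
    """
    vertices = [(x / (n - 1), y / (n - 1), 0.0) for y in range(n) for x in range(n)]
    edges = []
    for y in range(n):
        for x in range(n):
            i = y * n + x
            if x + 1 < n:
                edges.append((i, i + 1))        # horizontal edge
            if y + 1 < n:
                edges.append((i, i + n))        # vertical edge
            if x + 1 < n and y + 1 < n:
                edges.append((i, i + n + 1))    # diagonal edge
    return vertices, edges


def fold(vertices, crease_x, angle):
    """Rigidly rotate the part of the sheet with x > crease_x about the crease line.

    Vertices on or left of the crease stay in place, so every edge keeps its
    length as long as no edge crosses the crease.
    """
    folded = []
    for x, y, z in vertices:
        if x > crease_x:
            dx = x - crease_x
            folded.append((crease_x + dx * math.cos(angle), y, dx * math.sin(angle)))
        else:
            folded.append((x, y, z))
    return folded


def max_relative_stretch(rest_vertices, deformed_vertices, edges):
    """Largest relative edge-length change between rest and deformed mesh.

    A value near 0 means the deformation is (discretely) length-preserving.
    """
    worst = 0.0
    for i, j in edges:
        rest = math.dist(rest_vertices[i], rest_vertices[j])
        deformed = math.dist(deformed_vertices[i], deformed_vertices[j])
        worst = max(worst, abs(deformed - rest) / rest)
    return worst


rest, edges = grid_sheet(5)

# A fold along a line of grid vertices should not stretch any edge.
folded = fold(rest, crease_x=0.5, angle=math.pi / 3)
fold_error = max_relative_stretch(rest, folded, edges)

# Scaling the sheet by 10% along x stretches the horizontal edges by 10%.
stretched = [(1.1 * x, y, z) for x, y, z in rest]
stretch_error = max_relative_stretch(rest, stretched, edges)
```

With such a measure, a generated geometry can be flagged as physically implausible when its error exceeds a chosen tolerance, although a complete fidelity evaluation would likely need further criteria (for example self-intersection checks).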

Tasks

  • State-of-the-art review on GANs and paper simulation
  • Generating paper geometries using simulation
  • Using GANs to generate geometries and designing an accuracy and comparison methodology
  • Evaluating generated results from GANs and possible refinement strategies

Requirements

  • 3D computer vision, Computer graphics, Geometric modelling
  • Deep Learning (TensorFlow, PyTorch, Keras)
  • C++ (OpenGL, OpenCV), Python, C#

References

  • Goodfellow, I., Pouget-Abadie, 2020. Generative adversarial networks. Communications of the ACM, 63(11), pp.139-144.
  • Narain, R., 2013. Folding and crumpling adaptive sheets. ACM Transactions on Graphics (TOG), 32(4), pp.1-8.
  • Smith, E.J., 2017, October. Improved adversarial systems for 3D object generation and reconstruction. In Conference on Robot Learning (pp. 87-96). PMLR.
  • Xiao, H., 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms.
  • Das, S., 2019. DewarpNet: Single-image document unwarping with stacked 3D and 2D regression networks.


Apply here.