
The Augmented Vision Department of DFKI, led by Prof. Dr. Didier Stricker, is offering a Master's thesis topic for curious and passionate students who want to develop their skills in advanced Deep Learning and 3D Computer Vision. More guidance and specifics about the thesis will be discussed in the initial interview.

Main Tasks:

In-depth literature review of recent 6DoF Object Pose Estimation approaches and review of various datasets [2] used in evaluating these methods
Developing a novel approach to Object Pose Estimation [1] using recent Deep Learning models such as Transformers [3]
Extensive experimentation and ablation studies on various Object Pose Datasets and comparison to state-of-the-art approaches
Potentially submitting a concise paper to a Computer Vision conference

Requirements

Good knowledge of concepts in 3D Computer Vision and Deep Learning
Strong experience with Python and PyTorch/TensorFlow
Previous experience with image rendering or dataset creation software such as Blender is beneficial

References (Please go through papers [1] and [2] before the interview)

[1] Chen, K., Dou, Q.: SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation. ICCV 2021
[2] Sundermeyer, M., Hodan, T., Labbé, Y., Wang, G., Brachmann, E., Drost, B., Rother, C., & Matas, J.E.: BOP Challenge 2022 on Detection, Segmentation and Pose Estimation of Specific Rigid Objects. CVPRW 2023
[3] Vaswani, A., Shazeer, N.M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I.: Attention Is All You Need. NeurIPS 2017

Apply latest by: 31.08.2024

Contact:

Sandeep Inuganti, PhD Student

Augmented Vision, DFKI

Trippstadter Straße 122, 67663 Kaiserslautern

Email: sain01@dfki.de

Since their introduction in 2014, GANs have made significant strides, demonstrating remarkable capabilities in generating realistic audio and video as well as complex geometries. A persistent challenge, however, is ensuring the accuracy of the generated results: although they may look convincing, they often fail to preserve the geometric properties of what they aim to emulate. A pertinent example is the simulation of paper folding and crumpling, where maintaining the inherent geometric characteristics of the sheet (e.g., preventing stretching) is crucial. Deterministic simulations of paper deformation exist, but understanding and replicating them often requires extensive knowledge of material physics, mathematics, and computer graphics.

One potential approach is to use GANs or other network architectures, such as variational autoencoders, to analyze 3D geometries and generate accurate representations. Several hurdles must be overcome, including operating effectively in 3D space and establishing a robust methodology for evaluating the fidelity of the output geometries. An intriguing application of such methods is the rapid generation of accurate synthetic data, which can then be used to train learning algorithms in various domains and broaden their applicability.
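
To make the "preventing stretching" criterion concrete, the following is a minimal sketch of one possible fidelity check, not the methodology the thesis will use. It assumes the sheet is represented as a triangle mesh with the same connectivity before and after deformation, and it reports the largest relative change in edge length. The function names, the toy mesh, and the choice of metric are all illustrative assumptions.

```python
import math

def edge_lengths(vertices, edges):
    """Return the Euclidean length of each edge (i, j) of the mesh."""
    return [math.dist(vertices[i], vertices[j]) for i, j in edges]

def max_stretch(flat_vertices, deformed_vertices, edges):
    """Largest relative edge-length change between flat and deformed sheet.

    A value near 0.0 means every sampled edge kept its length; larger
    values indicate stretching or compression somewhere on the sheet.
    """
    flat = edge_lengths(flat_vertices, edges)
    deformed = edge_lengths(deformed_vertices, edges)
    return max(abs(d - f) / f for f, d in zip(flat, deformed))

# Hypothetical toy sheet: a unit square split into two triangles.
flat = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# Fold by 90 degrees along the diagonal (0, 2): vertex 3 lifts out of the
# plane, but all edge lengths are preserved.
folded = list(flat)
folded[3] = (0.5, 0.5, math.sqrt(0.5))

# Invalid deformation: vertex 1 is pulled outward, stretching the sheet.
stretched = list(flat)
stretched[1] = (1.2, 0.0, 0.0)

print(f"folded:    {max_stretch(flat, folded, edges):.4f}")
print(f"stretched: {max_stretch(flat, stretched, edges):.4f}")
```

A check of this kind only measures length preservation on the mesh edges. A full evaluation methodology would likely also need to consider bending behaviour, self-intersection, and comparison against simulated ground truth.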

Tasks

State-of-the-art review on GANs and paper simulation
Generating paper geometries using simulation
Using GANs to generate geometries and designing an accuracy and comparison methodology
Evaluating generated results from GANs and possible refinement strategies

Requirements

3D computer vision, Computer graphics, Geometric modelling
Deep Learning (TensorFlow, PyTorch, Keras)
C++ (OpenGL, OpenCV), Python, C#

References

Goodfellow, I., Pouget-Abadie, J., et al., 2020. Generative adversarial networks. Communications of the ACM, 63(11), pp.139-144.
Narain, R., et al., 2013. Folding and crumpling adaptive sheets. ACM Transactions on Graphics (TOG), 32(4), pp.1-8.
Smith, E.J., et al., 2017. Improved adversarial systems for 3D object generation and reconstruction. In Conference on Robot Learning (pp. 87-96). PMLR.
Xiao, H., et al., 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms.
Das, S., et al., 2019. DewarpNet: Single-image document unwarping with stacked 3D and 2D regression networks.

Apply here.