Continual learning for visual and multi-modal encoding of human surrounding and behavior
Machine learning, and deep learning-based Artificial Intelligence (AI) in particular, has revolutionized almost all areas of computer vision. These include motion estimation, object recognition, semantic segmentation (dividing an image into parts and classifying them), pose estimation of people and hands, and many more. A major problem with these methods is the distribution of the data: training data often differ greatly from the conditions encountered in real applications and do not cover them adequately. Even when suitable data are available, extensive retraining is time-consuming and costly. Developing adaptive methods that learn continuously (lifelong learning) is therefore the central challenge on the way to robust, realistic AI applications. Continual learning in general has a rich history, and continual learning for machine vision under real-world conditions has recently gained interest as well. The goal of the DECODE project is to explore continuously adaptive models for reconstructing and understanding human motion and the environment in application-oriented settings. For this purpose, mobile visual and inertial sensors (accelerometers and angular rate sensors) will be used. For these different types of sensors and data, different approaches from the field of continual learning will be researched and developed to ensure a smooth transfer from laboratory conditions to realistic everyday scenarios. The work will concentrate on image and video segmentation, the estimation of the kinematics and pose of the human body, and the representation of movements and their context. The field of potential applications for the methods developed in DECODE is wide-ranging and includes detailed ergonomic analysis of human-machine interactions, for example in the workplace, in factories, or in vehicles.