SynthSL: Expressive Humans for Sign Language Image Synthesis

Jilliam Maria Diaz Barros, Chen-Yu Wang, Jameel Malik, Abdalla Elsayed Abdou Elsayed Mohamed Arafa, Didier Stricker
In: Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024), May 27-31, 2024, Istanbul, Turkey. IEEE, 5/2024.

Abstract:
Around 5% of the world’s population lives with disabling hearing loss. Despite recent advances in improving accessibility for the Deaf community, research on sign language remains limited. In this work, we introduce SynthSL, a large-scale synthetic sign language dataset targeted at sign language production, recognition, and translation. Using state-of-the-art methods for human body modelling, SynthSL aims to augment current datasets by providing additional ground-truth data such as depth and normal maps, rendered models, segmentation masks, and 2D/3D body joints. We additionally explore a generative architecture for the synthesis of sign images and propose a new generator based on Swin Transformers, conditioned on a given body pose and appearance. We believe that an increase in the publicly available sign language data will boost research and close the performance gap with related topics in human body synthesis. Our code and dataset are available at https://github.com/jilliam/SynthSL.
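
For illustration only, the sketch below shows what a pose- and appearance-conditioned generator with a Swin Transformer encoder could look like; it is not the architecture from the paper. It assumes PyTorch and the timm library, and the class name PoseConditionedSwinGenerator, the joint count (137, as in OpenPose whole-body keypoints), and the toy convolutional decoder are all hypothetical choices for this sketch.

```python
# Minimal, illustrative sketch (NOT the authors' implementation) of a
# pose- and appearance-conditioned image generator built on a Swin
# Transformer encoder. Assumes PyTorch and timm are installed.
import torch
import torch.nn as nn
import timm


class PoseConditionedSwinGenerator(nn.Module):
    """Hypothetical generator: a Swin encoder extracts appearance
    features from a reference image; 2D body joints are embedded and
    fused with those features before a convolutional decoder renders
    the signer in the target pose."""

    def __init__(self, num_joints=137, feat_dim=768):
        super().__init__()
        # Swin-Tiny backbone from timm, used as an appearance encoder.
        # num_classes=0 removes the classifier and returns pooled features.
        self.appearance_encoder = timm.create_model(
            "swin_tiny_patch4_window7_224", pretrained=False, num_classes=0
        )
        enc_dim = self.appearance_encoder.num_features  # 768 for Swin-T
        # Embed flattened (x, y) joint coordinates into the encoder space.
        self.pose_embed = nn.Sequential(
            nn.Linear(num_joints * 2, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, enc_dim),
        )
        # Toy decoder: project the fused feature to a 7x7 map, then
        # upsample through five 2x stages to a 224x224 RGB image.
        self.to_map = nn.Linear(enc_dim, 256 * 7 * 7)
        layers, ch = [], 256
        for _ in range(5):  # 7 -> 14 -> 28 -> 56 -> 112 -> 224
            layers += [nn.Upsample(scale_factor=2, mode="nearest"),
                       nn.Conv2d(ch, ch // 2, 3, padding=1), nn.ReLU()]
            ch //= 2
        layers += [nn.Conv2d(ch, 3, 3, padding=1), nn.Tanh()]
        self.decoder = nn.Sequential(*layers)

    def forward(self, appearance_img, joints_2d):
        feats = self.appearance_encoder(appearance_img)   # (B, enc_dim)
        fused = feats + self.pose_embed(joints_2d.flatten(1))
        x = self.to_map(fused).view(-1, 256, 7, 7)
        return self.decoder(x)                            # (B, 3, 224, 224)


if __name__ == "__main__":
    gen = PoseConditionedSwinGenerator()
    img = torch.randn(1, 3, 224, 224)   # reference appearance image
    joints = torch.randn(1, 137, 2)     # target 2D body/hand/face joints
    print(gen(img, joints).shape)       # torch.Size([1, 3, 224, 224])
```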