Generation of Human Images with Clothing using Advanced Conditional Generative Adversarial Networks, International Conference on Deep Learning Theory and Applications (DeLTA 2020)

Sheela Raju Kurupathi, Pramod Murthy, Didier Stricker
In: Ana Fred; Kurosh Madani (Eds.). Proceedings of the 1st International Conference on Deep Learning Theory and Applications DeLTA 2020. International Conference on Deep Learning Theory and Applications (DeLTA-2020), July 8-10, Pages 30-41, Vol. 1, ISBN 978-989-758-441-1, SciTePress, 2020.

Abstract:
One of the main challenges in human-image generation is generating a person together with the correct pose and clothing details. This remains difficult because of challenging backgrounds and appearance variance. Recently, various deep learning models such as Stacked Hourglass networks, Variational Autoencoders (VAEs), and Generative Adversarial Networks (GANs) have been used to address this problem. However, they still do not generalize well, qualitatively, to real-world human-image generation. Our main goal is to use the Spectral Normalization (SN) technique when training a GAN to synthesize a human image with accurate pose and appearance details of the person. In this paper, we investigate how Conditional GANs combined with Spectral Normalization (SN) can synthesize a new image of a target person, given an image of that person and the desired target (novel) pose. The model uses 2D keypoints to represent human poses. We also use an adversarial hinge loss and present an ablation study. The proposed model variants generate promising results on both the Market-1501 and DeepFashion datasets. We support our claims by benchmarking the proposed model against recent state-of-the-art models. Finally, we show how the Spectral Normalization (SN) technique influences the process of human-image synthesis.
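
The abstract names two training ingredients, Spectral Normalization and the adversarial hinge loss, without giving formulas. The sketch below illustrates the commonly used formulations of both in plain Python; it is not taken from the paper, and the authors' implementation may differ. The function names (`spectral_normalize`, `d_hinge_loss`, `g_hinge_loss`), the iteration count, and the all-ones starting vector are assumptions made for illustration.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def normalize(x, eps=1e-12):
    """Scale x to (approximately) unit Euclidean length."""
    n = math.sqrt(sum(xi * xi for xi in x)) + eps
    return [xi / n for xi in x]


def spectral_normalize(W, n_iters=50):
    """Divide W by a power-iteration estimate of its largest singular value.

    Assumption: the all-ones start vector is not orthogonal to the top
    left singular vector of W; practical implementations start from a
    random vector instead.
    """
    Wt = [list(col) for col in zip(*W)]
    u = normalize([1.0] * len(W))
    for _ in range(n_iters):
        v = normalize(matvec(Wt, u))
        u = normalize(matvec(W, v))
    # sigma approximates the spectral norm: u^T W v
    sigma = sum(ui * wi for ui, wi in zip(u, matvec(W, v)))
    return [[w / sigma for w in row] for row in W], sigma


def d_hinge_loss(real_scores, fake_scores):
    """Discriminator hinge loss: push real scores above +1, fake below -1."""
    real_term = sum(max(0.0, 1.0 - r) for r in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + f) for f in fake_scores) / len(fake_scores)
    return real_term + fake_term


def g_hinge_loss(fake_scores):
    """Generator hinge loss: raise the discriminator's score on fakes."""
    return -sum(fake_scores) / len(fake_scores)


# Hypothetical toy weight matrix; its largest singular value is 3.
W_sn, sigma = spectral_normalize([[3.0, 0.0], [0.0, 1.0]])
```

In a GAN, the normalization would typically be applied to the weight of each discriminator layer on every training step, so that each layer's spectral norm stays close to 1.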