deepRGBXYZ: Dense Pixel Description Utilizing RGB and Depth with Stacked Dilated Convolutions
Ramy Battrawy, René Schuster, Oliver Wasenmüller, Qing Rao, Didier Stricker
IEEE Intelligent Transportation Systems Conference (IEEE ITSC-2020), September 20-23, 2020, Rhodes, Greece. IEEE, 9/2020.

Abstract:
In this paper, we propose deepRGBXYZ – a feature descriptor that represents pixels for robust dense pixel matching. To this end, we concatenate RGB image (appearance) information with depth (geometric) information, represented as XYZ, in order to build a robust descriptor that is more invariant to photometric and geometric changes. Both types of information (RGB and depth) are embedded as an early fusion into one neural network, which is based on stacked dilated convolutions to enlarge the receptive field. We alleviate the limitations of image-only descriptors, especially within ill-conditioned light regions or on textureless objects. Additionally, we overcome the difficulty of using depth-only information, which shows fewer descriptive details compared to image-only information. We demonstrate the superior accuracy of our deepRGBXYZ descriptor against state-of-the-art image-only descriptors and verify our design decisions. In addition, we investigate the superior robustness of our deepRGBXYZ descriptor by applying it to optical flow and scene flow estimation on the established datasets KITTI and FlyingThings3D.
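
To illustrate the architectural idea described in the abstract (early fusion of RGB and XYZ followed by stacked dilated convolutions producing a per-pixel descriptor), the following is a minimal PyTorch sketch. It is not the authors' implementation: the layer count, channel widths, dilation rates, and descriptor dimension are illustrative assumptions only.

```python
# Minimal sketch (not the authors' implementation) of an early-fusion
# descriptor network: RGB and XYZ are concatenated channel-wise and passed
# through stacked dilated convolutions, yielding one descriptor per pixel.
import torch
import torch.nn as nn


class EarlyFusionDilatedDescriptor(nn.Module):
    def __init__(self, descriptor_dim: int = 32):
        super().__init__()
        layers = []
        in_channels = 6  # 3 RGB channels + 3 XYZ channels (early fusion)
        # Stacked dilated convolutions: increasing dilation enlarges the
        # receptive field while keeping full spatial resolution.
        for dilation in (1, 2, 4, 8, 16):  # assumed dilation schedule
            layers += [
                nn.Conv2d(in_channels, 64, kernel_size=3,
                          padding=dilation, dilation=dilation),
                nn.ReLU(inplace=True),
            ]
            in_channels = 64
        # Final 1x1 convolution maps features to the descriptor dimension.
        layers.append(nn.Conv2d(in_channels, descriptor_dim, kernel_size=1))
        self.net = nn.Sequential(*layers)

    def forward(self, rgb: torch.Tensor, xyz: torch.Tensor) -> torch.Tensor:
        # rgb, xyz: (B, 3, H, W); output: (B, descriptor_dim, H, W)
        x = torch.cat([rgb, xyz], dim=1)
        return self.net(x)


if __name__ == "__main__":
    model = EarlyFusionDilatedDescriptor()
    rgb = torch.rand(1, 3, 128, 256)
    xyz = torch.rand(1, 3, 128, 256)
    print(model(rgb, xyz).shape)  # torch.Size([1, 32, 128, 256])
```

The resulting dense descriptor map could then be used for pixel matching, e.g. by comparing descriptors between two frames for optical or scene flow correspondence search.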