MSDE: Multi-Scale Disparity Estimation Model From Stereo Images

Ahmed Alghoul, Mhd Rashed Al Koutayni, Ramy Battrawy, Didier Stricker, Wesam Ashour
In: Journal of Autonomous Intelligence, Vol. 7, Pages 1-15, Frontier Scientific Publishing, 2024.

Abstract:
Most modern stereo matching algorithms predict accurate disparity maps but demand substantial memory and processing power as well as a large number of floating-point operations. Consequently, their applicability is limited to high-powered devices, which makes deployment on low-power devices challenging. To address this problem, we propose MSDE, an efficient end-to-end neural network model designed to balance estimation accuracy against resource utilization. MSDE is based on hierarchical disparity estimation combined with the computation of low-dimensional residual and error cost volumes. To reduce the number of operations, 3D convolutional layers are factorized into 2D and 1D convolutional layers, which makes filtering and aggregation of the cost volume features more efficient. As a result, the entire MSDE model has 48 K parameters, requires 2.5 G floating-point operations (FLOPs), and runs with a comparatively small memory footprint of 730 M and an execution time of 29.5 ms per frame on an RTX 2080Ti GPU. Compared to state-of-the-art methods, our model is more efficient, offers a trade-off between accuracy and efficiency, and requires few hardware resources.
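
To illustrate why factorizing a 3D convolution can reduce cost, the following minimal sketch compares the parameter and multiply-accumulate counts of a dense k x k x k convolution with a 2D spatial convolution followed by a 1D convolution along the disparity axis. The abstract does not specify the exact factorization used in MSDE, so the ordering, the intermediate channel width `c_mid`, the function names, and the example sizes are all assumptions made for illustration.

```python
def conv3d_cost(c_in, c_out, k, d, h, w):
    """Cost of a dense k x k x k 3D convolution.

    Assumes no bias, stride 1, and 'same' padding, so the output
    volume has d * h * w positions per output channel.
    Returns (parameters, multiply-accumulates).
    """
    params = c_in * c_out * k ** 3
    macs = params * d * h * w
    return params, macs


def factorized_cost(c_in, c_out, k, d, h, w, c_mid=None):
    """Cost of a 2D (1 x k x k) convolution followed by a 1D (k x 1 x 1) one.

    c_mid is the hypothetical channel width between the two layers;
    it defaults to c_out. Same bias, stride, and padding assumptions
    as conv3d_cost.
    """
    c_mid = c_out if c_mid is None else c_mid
    params_2d = c_in * c_mid * k * k  # spatial filtering over height and width
    params_1d = c_mid * c_out * k     # filtering along the disparity axis
    params = params_2d + params_1d
    macs = params * d * h * w
    return params, macs


# Hypothetical cost volume: 8 channels, 12 disparity levels, 64 x 128 pixels.
dense_params, dense_macs = conv3d_cost(8, 8, 3, 12, 64, 128)
fact_params, fact_macs = factorized_cost(8, 8, 3, 12, 64, 128)
print(dense_params, fact_params)  # 1728 768
print(dense_macs / fact_macs)     # 2.25
```

With equal channel widths the saving is k^2 / (k + 1), which is 2.25 for k = 3. The figures reported in the abstract depend on the full architecture and cannot be derived from this sketch.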