This paper focuses on perceiving and navigating 3D environments using echoes and RGB images. In particular, we perform depth estimation by fusing an RGB image with echoes received from multiple orientations. Unlike previous works, we go beyond the field of view of the RGB image and estimate dense depth maps for substantially larger parts of the environment. We show that the echoes provide holistic and inexpensive information about the 3D structures, complementing the RGB image. Moreover, we study how echoes and the wide field-of-view depth maps can be utilised in robot navigation. We compare the proposed methods against recent baselines on two sets of challenging, realistic 3D environments: Replica and Matterport3D. The implementation and pre-trained models will be made publicly available.