A Concise but Effective Network for Image Guided Depth Completion in Autonomous Driving

Depth completion is a crucial task in autonomous driving, aiming to convert a sparse depth map into a dense depth prediction. Because RGB images carry rich semantic information, they are commonly fused with the sparse depth input to improve completion quality. Image-guided depth completion involves three key challenges: 1) how to effectively fuse the two modalities; 2) how to better recover depth information; and 3) how to achieve real-time prediction for practical autonomous driving. To address these challenges, we propose a concise but effective network, named CENet, which achieves high-performance depth completion with a simple and elegant structure. First, we use a fast guidance module to fuse the features of the two sensors, exploiting the abundant auxiliary features extracted from the color space. Unlike the complicated guidance modules in common use, our approach is intuitive and low-cost. In addition, we identify and analyze an optimization inconsistency between observed and unobserved positions, and propose a decoupled depth prediction head to alleviate it. The decoupled head better predicts the depth of valid and invalid positions with very little extra inference time. Built on a simple dual-encoder, single-decoder structure, CENet achieves a superior balance between accuracy and efficiency. On the KITTI depth completion benchmark, CENet attains competitive performance and inference speed compared with state-of-the-art methods. To validate the generalization of our method, we also evaluate on the indoor NYUv2 dataset, where CENet still achieves impressive results. The code of this work will be available at https://github.com/lmomoy/CENet.
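The abstract does not say how the decoupled head combines its outputs. As a rough illustration only, the sketch below assumes the head produces two dense maps, one for observed positions and one for unobserved positions, and that they are merged with a hard validity mask taken from the sparse input (a pixel is treated as observed when its sparse depth is greater than zero). The function name `merge_decoupled_predictions`, the two branch outputs, and the toy values are all hypothetical and are not taken from the CENet code.

```python
from typing import List

Grid = List[List[float]]


def merge_decoupled_predictions(
    sparse_depth: Grid, pred_observed: Grid, pred_unobserved: Grid
) -> Grid:
    """Select a per-pixel depth from one of two branch outputs.

    Assumption (not from the paper): a pixel is "observed" when its sparse
    depth is > 0, and the two branch outputs are merged with a hard mask.
    """
    merged = []
    for sparse_row, obs_row, unobs_row in zip(
        sparse_depth, pred_observed, pred_unobserved
    ):
        merged.append(
            [
                obs if sparse > 0 else unobs
                for sparse, obs, unobs in zip(sparse_row, obs_row, unobs_row)
            ]
        )
    return merged


# Toy 2x2 example: only the top-right pixel has a sparse depth measurement.
sparse = [[0.0, 5.2], [0.0, 0.0]]
pred_obs = [[9.0, 5.1], [9.0, 9.0]]    # hypothetical output for observed pixels
pred_unobs = [[4.8, 7.0], [6.3, 6.1]]  # hypothetical output for unobserved pixels

dense = merge_decoupled_predictions(sparse, pred_obs, pred_unobs)
print(dense)
```

In this reading, each branch only needs to be accurate on its own set of positions, which is one way the two kinds of pixels could be optimized separately; the actual head in CENet may differ.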

