Holistic scene understanding is pivotal for the performance of autonomous machines. In this paper, we propose a new end-to-end model that performs semantic segmentation and depth completion jointly. The vast majority of recent approaches have treated semantic segmentation and depth completion as independent tasks. Our model takes an RGB image and sparse depth as inputs and produces a dense depth map and the corresponding semantic segmentation image. It consists of a feature extractor, a depth completion branch, a semantic segmentation branch, and a joint branch that further processes semantic and depth information together. Experiments on the Virtual KITTI 2 dataset provide further evidence that combining semantic segmentation and depth completion in a multi-task network can effectively improve the performance of each task. Code is available at https://github.com/juanb09111/semantic depth.
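The data flow described above (shared feature extractor, two task branches, and a joint branch feeding both outputs) can be illustrated with a minimal shape-level sketch. This is not the released implementation: the function names `linear` and `joint_forward`, the channel widths, the default `num_classes`, the way the joint features are concatenated back into each head, and the use of random per-pixel linear layers (1x1 convolutions) in place of trained convolutional blocks are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, out_channels, relu=True):
    """Per-pixel linear layer (equivalent to a 1x1 convolution) with random weights."""
    weights = rng.standard_normal((x.shape[-1], out_channels)) * 0.1
    y = x @ weights
    return np.maximum(y, 0.0) if relu else y

def joint_forward(rgb, sparse_depth, num_classes=8):
    """rgb: (H, W, 3); sparse_depth: (H, W), zero where no measurement exists.

    Returns a dense depth map (H, W) and a semantic label map (H, W).
    num_classes is a placeholder, not the dataset's actual class count.
    """
    # Shared feature extractor over the concatenated RGB + sparse depth input.
    x = np.concatenate([rgb, sparse_depth[..., None]], axis=-1)  # (H, W, 4)
    features = linear(x, 32)

    # Task-specific branches.
    depth_feat = linear(features, 16)
    sem_feat = linear(features, 16)

    # Joint branch: processes depth and semantic features together.
    joint_feat = linear(np.concatenate([depth_feat, sem_feat], axis=-1), 16)

    # Output heads: each task sees its own features plus the joint features
    # (an assumed fusion scheme, chosen here for simplicity).
    dense_depth = linear(
        np.concatenate([depth_feat, joint_feat], axis=-1), 1, relu=False
    )[..., 0]
    sem_logits = linear(
        np.concatenate([sem_feat, joint_feat], axis=-1), num_classes, relu=False
    )
    return dense_depth, sem_logits.argmax(axis=-1)
```

Calling `joint_forward(rgb, sparse_depth)` on an `(H, W, 3)` image and an `(H, W)` sparse depth map returns two `(H, W)` arrays, mirroring the two outputs named in the abstract. With random weights the values themselves are meaningless; the sketch only shows how the four components connect.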