A typical monocular depth estimator is trained for a single camera, so its performance degrades severely on images captured by other cameras. To address this issue, we propose a versatile depth estimator (VDE), composed of a common relative depth estimator (CRDE) and multiple relative-to-metric converters (R2MCs). The CRDE extracts relative depth information, and each R2MC converts this relative information into metric depths for a specific camera. The proposed VDE can handle diverse indoor and outdoor scenes with only a 1.12% increase in parameters per camera. Experimental results demonstrate that VDE supports multiple cameras effectively and efficiently and also achieves state-of-the-art performance in the conventional single-camera scenario.
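The division of labor between the CRDE and the R2MCs can be illustrated with a minimal sketch. It is not the actual VDE architecture: the min-max normalization standing in for the CRDE, the affine (scale and shift) form of each converter, and all names and numeric values (`AffineConverter`, `toy_relative_depth`, `indoor_cam`, `outdoor_cam`) are hypothetical assumptions chosen only to show one shared relative-depth stage feeding small camera-specific conversion stages.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

DepthMap = List[List[float]]  # toy 2-D map stored as nested lists


@dataclass
class AffineConverter:
    """Hypothetical stand-in for an R2MC: metric = scale * relative + shift.

    The real converters are learned modules; an affine map is assumed here
    purely for illustration.
    """
    scale: float
    shift: float

    def __call__(self, relative: DepthMap) -> DepthMap:
        return [[self.scale * v + self.shift for v in row] for row in relative]


def toy_relative_depth(image: DepthMap) -> DepthMap:
    """Placeholder for the CRDE: min-max normalizes pixel values to [0, 1].

    A real CRDE would be a neural network; this only mimics its role of
    producing camera-independent relative depth.
    """
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant images
    return [[(v - lo) / span for v in row] for row in image]


class VersatileDepthEstimator:
    """One shared relative estimator plus one converter per camera."""

    def __init__(
        self,
        relative_estimator: Callable[[DepthMap], DepthMap],
        converters: Dict[str, Callable[[DepthMap], DepthMap]],
    ) -> None:
        self.relative_estimator = relative_estimator
        self.converters = dict(converters)

    def predict(self, image: DepthMap, camera: str) -> DepthMap:
        relative = self.relative_estimator(image)  # shared across all cameras
        return self.converters[camera](relative)   # camera-specific conversion


# Hypothetical cameras and converter settings, for illustration only.
vde = VersatileDepthEstimator(
    toy_relative_depth,
    {
        "indoor_cam": AffineConverter(scale=10.0, shift=0.5),
        "outdoor_cam": AffineConverter(scale=80.0, shift=1.0),
    },
)
image = [[0.0, 2.0], [4.0, 8.0]]
indoor_depth = vde.predict(image, "indoor_cam")
outdoor_depth = vde.predict(image, "outdoor_cam")
```

In this sketch, supporting an additional camera means adding one more entry to `converters` while the shared estimator is reused unchanged, which mirrors why the per-camera cost can stay small.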