The recent development of foundation models for monocular depth estimation, such as Depth Anything, has paved the way for zero-shot monocular depth estimation. Since such a model returns an affine-invariant disparity map, the favored technique to recover metric depth is to fine-tune the model. However, this stage is costly, both because of the training itself and because of the creation of the dataset, which must contain images captured by the camera that will be used at test time along with the corresponding ground truth. Moreover, fine-tuning may also degrade the generalization capacity of the original model. Instead, we propose in this paper a new method to rescale Depth Anything predictions using 3D points provided by low-cost sensors or techniques such as a low-resolution LiDAR, a stereo camera, or structure-from-motion with poses given by an IMU. This approach thus avoids fine-tuning and preserves the generalization power of the original depth estimation model, while remaining robust to the noise of the sensor or of the depth model. Our experiments highlight improvements relative to other metric depth estimation methods and competitive results compared to fine-tuned approaches. Code is available at https://gitlab.ensta.fr/ssh/monocular-depth-rescaling.
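To give a concrete sense of the kind of rescaling the abstract describes, the sketch below fits a global scale and shift that align the model's affine-invariant disparity with the inverse of the metric depths available at a sparse set of 3D points. This is a minimal least-squares illustration under our own assumptions, not the paper's exact algorithm (which may use a more robust fit); the function and variable names are ours.

```python
import numpy as np

def rescale_disparity(pred_disp, sparse_depth, mask):
    """Fit scale s and shift t so that s * pred_disp + t ~= 1 / metric_depth
    at the sparse points, then convert the whole map to metric depth.

    pred_disp    : (H, W) affine-invariant disparity from the depth model.
    sparse_depth : (H, W) metric depths, valid only where mask is True
                   (e.g. projected low-resolution LiDAR returns).
    mask         : (H, W) boolean validity mask for sparse_depth.
    """
    d = pred_disp[mask]                 # predicted disparities at sparse points
    g = 1.0 / sparse_depth[mask]       # target disparities (inverse depths)
    # Solve the linear least-squares problem [d, 1] @ [s, t]^T = g.
    A = np.stack([d, np.ones_like(d)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    # Dense metric depth; clip to keep the inversion numerically safe.
    metric_depth = 1.0 / np.clip(s * pred_disp + t, 1e-6, None)
    return metric_depth, s, t
```

With noise-free synthetic data the two parameters are recovered exactly; on real sensor data one would typically add outlier rejection before the fit.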