Generalization to unseen datasets, a measure of in-the-wild robustness, remains understudied for indoor single-image depth prediction. We use gradient-based meta-learning to improve generalization in zero-shot cross-dataset inference. Unlike image classification, the most-studied setting in meta-learning, depth prediction outputs continuous pixel-level range values, and the mapping from image to depth varies widely across environments, so no explicit task boundaries exist. We therefore propose a fine-grained task formulation that treats each RGB-D pair as a task in our meta-optimization. We first show that meta-learning on limited data induces a much better prior (up to +29.4\%). Used as initialization for subsequent supervised learning, the meta-learned weights consistently outperform baselines without meta-initialization, without requiring extra data or information. Whereas most indoor-depth methods train and test on a single dataset, we propose zero-shot cross-dataset protocols, closely evaluate robustness, and show that our meta-initialization consistently yields higher generalizability and accuracy. This work at the intersection of depth prediction and meta-learning may bring both research streams closer to practical use.
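To make the fine-grained task idea concrete, the following is a minimal sketch of gradient-based meta-learning in which every RGB-D pair is its own task. It rests on several assumptions that are not taken from the abstract: a first-order, Reptile-style meta-update stands in for whatever meta-optimizer the authors use, a per-pixel linear model stands in for a depth network, and the data are synthetic. All function names (`inner_adapt`, `meta_train`, `make_toy_pairs`) and hyperparameters are hypothetical and chosen for illustration only.

```python
import random

def predict(params, pixel):
    # Toy per-pixel linear "depth network": depth = w . rgb + b (an assumption).
    w, b = params
    return w[0] * pixel[0] + w[1] * pixel[1] + w[2] * pixel[2] + b

def loss_and_grad(params, pair):
    # Mean squared error over all pixels of one RGB-D pair, with its gradient.
    rgb, depth = pair
    n = len(rgb)
    loss, gw, gb = 0.0, [0.0, 0.0, 0.0], 0.0
    for pixel, d in zip(rgb, depth):
        err = predict(params, pixel) - d
        loss += err * err / n
        for c in range(3):
            gw[c] += 2.0 * err * pixel[c] / n
        gb += 2.0 * err / n
    return loss, (gw, gb)

def inner_adapt(params, pair, inner_lr, inner_steps):
    # Inner loop: a few gradient steps on a single RGB-D pair (one task).
    w, b = list(params[0]), params[1]
    for _ in range(inner_steps):
        _, (gw, gb) = loss_and_grad((w, b), pair)
        w = [wc - inner_lr * g for wc, g in zip(w, gw)]
        b = b - inner_lr * gb
    return w, b

def meta_train(pairs, meta_iters=300, inner_lr=0.1, inner_steps=3,
               meta_lr=0.5, seed=0):
    # Outer loop (Reptile-style, assumed): move the shared initialization
    # toward the weights adapted to one randomly sampled pair per iteration.
    rng = random.Random(seed)
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(meta_iters):
        pair = rng.choice(pairs)
        w_t, b_t = inner_adapt((w, b), pair, inner_lr, inner_steps)
        w = [wc + meta_lr * (tc - wc) for wc, tc in zip(w, w_t)]
        b = b + meta_lr * (b_t - b)
    return w, b

def mean_loss(params, pairs):
    # Average per-pair loss, used here as a stand-in for a depth error metric.
    return sum(loss_and_grad(params, p)[0] for p in pairs) / len(pairs)

def make_toy_pairs(num_pairs=8, num_pixels=64, seed=0):
    # Synthetic data: each pair has its own slightly different RGB-to-depth
    # mapping, mimicking variation across environments.
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        true_w = [1.0 + rng.gauss(0, 0.1),
                  0.5 + rng.gauss(0, 0.1),
                  -0.5 + rng.gauss(0, 0.1)]
        rgb = [(rng.random(), rng.random(), rng.random())
               for _ in range(num_pixels)]
        depth = [sum(wc * pc for wc, pc in zip(true_w, px)) + 2.0
                 for px in rgb]
        pairs.append((rgb, depth))
    return pairs

if __name__ == "__main__":
    train_pairs = make_toy_pairs(seed=0)
    held_out = make_toy_pairs(seed=1)  # pairs never seen during meta-training
    meta_init = meta_train(train_pairs)
    print("zero init loss:", mean_loss(([0.0, 0.0, 0.0], 0.0), held_out))
    print("meta init loss:", mean_loss(meta_init, held_out))
```

In this toy setting, the weights returned by `meta_train` would play the role of the meta-learned initialization, and ordinary supervised training would then start from them rather than from scratch. The sketch only illustrates the control flow of per-pair tasks; it makes no claim about the architecture, loss, or numbers reported in the paper.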