Generalizability to unseen datasets, which determines in-the-wild robustness, is understudied for indoor single-image depth prediction. We leverage gradient-based meta-learning to improve generalizability in zero-shot cross-dataset inference. Unlike image classification, the most-studied setting in meta-learning, depth prediction outputs continuous, pixel-level range values, and the mapping from image to depth varies widely across environments, so no explicit task boundaries exist. We therefore propose a fine-grained task formulation that treats each RGB-D pair as a task in our meta-optimization. We first show that meta-learning on limited data induces a much better prior (up to +29.4\%). When the meta-learned weights are used as initialization for subsequent supervised learning, without any extra data or information, the resulting models consistently outperform baselines trained without meta-initialization. Whereas most indoor-depth methods train and test on a single dataset, we propose zero-shot cross-dataset protocols, closely evaluate robustness, and show that our meta-initialization yields consistently higher generalizability and accuracy. This work at the intersection of depth prediction and meta-learning may bring both research streams a step closer to practical use.
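To make the fine-grained task idea concrete, the following is a minimal sketch rather than the method itself. The abstract says only "gradient-based meta-learning", so the Reptile-style first-order update used here is an assumption, and a toy per-pixel linear model stands in for a depth network. All names (`make_task`, `inner_adapt`, `meta_train`, `mean_loss`) and all hyperparameters are hypothetical. Each task is a single synthetic (pixels, depths) pair, mirroring the one-RGB-D-pair-per-task formulation.

```python
import random


def make_task(scale, offset, rng, n=16):
    """One synthetic 'RGB-D pair': pixel intensities and their depths.
    Each pair has its own image-to-depth mapping (scale, offset)."""
    pixels = [rng.random() for _ in range(n)]
    depths = [scale * x + offset for x in pixels]
    return pixels, depths


def loss(params, task):
    """Mean squared error of the toy model depth = w * pixel + b."""
    w, b = params
    pixels, depths = task
    return sum((w * x + b - d) ** 2 for x, d in zip(pixels, depths)) / len(pixels)


def grad(params, task):
    """Analytic gradient of the mean squared error w.r.t. (w, b)."""
    w, b = params
    pixels, depths = task
    n = len(pixels)
    gw = sum(2 * (w * x + b - d) * x for x, d in zip(pixels, depths)) / n
    gb = sum(2 * (w * x + b - d) for x, d in zip(pixels, depths)) / n
    return gw, gb


def inner_adapt(params, task, inner_lr=0.1, steps=5):
    """Inner loop: a few gradient steps on a single pair (one task)."""
    w, b = params
    for _ in range(steps):
        gw, gb = grad((w, b), task)
        w, b = w - inner_lr * gw, b - inner_lr * gb
    return w, b


def meta_train(tasks, meta_lr=0.5, epochs=50, seed=0):
    """Outer loop (Reptile-style, assumed): move the shared initialization
    toward the weights adapted on each individual pair."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        order = list(tasks)
        rng.shuffle(order)
        for task in order:
            aw, ab = inner_adapt((w, b), task)
            w, b = w + meta_lr * (aw - w), b + meta_lr * (ab - b)
    return w, b


def mean_loss(params, tasks):
    """Average loss of one parameter setting over a set of tasks."""
    return sum(loss(params, t) for t in tasks) / len(tasks)


if __name__ == "__main__":
    rng = random.Random(1)
    tasks = [make_task(2.0 + rng.random(), 1.0 + rng.random(), rng) for _ in range(8)]
    meta_init = meta_train(tasks)
    print("zero init loss:", mean_loss((0.0, 0.0), tasks))
    print("meta init loss:", mean_loss(meta_init, tasks))
```

In this toy setting, the returned `meta_init` plays the role of the meta-learned prior: it would be passed as the starting weights to ordinary supervised training instead of a zero or random initialization.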