Unsupervised depth completion methods are trained by minimizing sparse depth and image reconstruction error. Block artifacts from resampling, intensity saturation, and occlusions are among the many undesirable by-products of common data augmentation schemes that degrade image reconstruction quality, and thus the training signal. Hence, typical image augmentations that are viewed as essential to training pipelines in other vision tasks have seen limited use beyond small image intensity changes and flipping. The sparse depth modality has seen even less use, as intensity transformations alter the scale of the 3D scene, and geometric transformations may decimate the sparse points during resampling. We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion. This is achieved by reversing, or "undo"-ing, the geometric transformations on the coordinates of the output depth, warping the depth map back to the original reference frame. This enables computing the reconstruction losses using the original images and sparse depth maps, eliminating the pitfalls of naive loss computation on the augmented inputs. This simple yet effective strategy allows us to scale up augmentations to boost performance. We demonstrate our method on indoor (VOID) and outdoor (KITTI) datasets, where we improve upon three existing methods by an average of 11.75% across both datasets.
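The "undo" idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`augment`, `undo`, `toy_model`, `sparse_depth_loss`) are hypothetical, the "model" is a trivial stand-in that fills missing depth with the mean of the valid points, and only exactly invertible transformations (a horizontal flip and 90-degree rotations) are used so that no resampling is needed. Only the sparse depth term of the loss is shown; the image reconstruction term is omitted.

```python
import numpy as np

def augment(x, flip, k):
    """Apply a horizontal flip (optional), then k counter-clockwise 90-degree rotations."""
    x = x[:, ::-1] if flip else x
    return np.rot90(x, k)

def undo(x, flip, k):
    """Invert `augment`: rotate back first, then undo the flip."""
    x = np.rot90(x, -k)
    return x[:, ::-1] if flip else x

def toy_model(image, sparse_depth):
    """Hypothetical stand-in for a depth completion network.

    Keeps the sparse measurements and fills every other pixel with their mean.
    """
    valid = sparse_depth > 0
    fill = sparse_depth[valid].mean() if valid.any() else 0.0
    return np.where(valid, sparse_depth, fill)

def sparse_depth_loss(pred, sparse_depth):
    """Mean absolute error over pixels that carry a sparse measurement."""
    valid = sparse_depth > 0
    return np.abs(pred[valid] - sparse_depth[valid]).mean()

rng = np.random.default_rng(0)
image = rng.random((4, 6))            # toy grayscale image
sparse_depth = np.zeros((4, 6))       # zero marks a missing measurement
sparse_depth[1, 2] = 3.0
sparse_depth[3, 5] = 7.5

flip, k = True, 1                     # assumed augmentation parameters

# 1. Augment both inputs with the same geometric transformation.
image_aug = augment(image, flip, k)
sparse_aug = augment(sparse_depth, flip, k)

# 2. Predict dense depth in the augmented frame.
pred_aug = toy_model(image_aug, sparse_aug)

# 3. Undo the transformation on the output only.
pred = undo(pred_aug, flip, k)

# 4. Compute the loss against the original, un-augmented sparse depth.
loss = sparse_depth_loss(pred, sparse_depth)
```

Because the loss is evaluated in the original reference frame, the supervision signal never passes through the augmentation. For general transformations such as arbitrary rotations or resizing, the inverse would require warping with interpolation rather than the exact index permutations used in this sketch.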