Data mixing augmentations have proven highly successful in image classification tasks. However, these techniques cannot be readily applied to object detection because of challenges such as spatial misalignment, the foreground/background distinction, and the plurality of instances. To tackle these issues, we first introduce a novel conceptual framework called Supervision Interpolation (SI), which offers a fresh perspective on interpolation-based augmentations by relaxing and generalizing Mixup. Based on SI, we propose LossMix, a simple yet versatile and effective regularization that enhances the performance and robustness of object detectors and more. Our key insight is that we can effectively regularize training on mixed data by interpolating the loss errors rather than the ground-truth labels. Empirical results on the PASCAL VOC and MS COCO datasets demonstrate that LossMix consistently outperforms state-of-the-art methods widely adopted for detection. Furthermore, by jointly leveraging LossMix with unsupervised domain adaptation, we improve existing approaches and set a new state of the art for cross-domain object detection.
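The key insight, interpolating losses rather than labels, can be illustrated with a minimal sketch. This is not the authors' implementation: the pixel-wise input mixing, the toy classifier `toy_predict`, and all function names are assumptions made for illustration, and the mixing coefficient `lam` is passed in directly rather than sampled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def mix_inputs(x_a, x_b, lam):
    """Element-wise convex combination of two equally sized inputs (Mixup-style)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def lossmix_loss(predict, loss_fn, x_a, y_a, x_b, y_b, lam):
    """Interpolate the two losses instead of the two labels.

    The model sees one mixed input; its single prediction is scored
    against each original target separately, and the two losses are
    combined with the same coefficient used to mix the inputs.
    """
    pred = predict(mix_inputs(x_a, x_b, lam))
    return lam * loss_fn(pred, y_a) + (1.0 - lam) * loss_fn(pred, y_b)


def toy_predict(x):
    """Hypothetical stand-in for a real model: a fixed two-class linear scorer."""
    s = sum(x)
    return softmax([s, -s])


# Usage: two samples with different labels, mixed half and half.
loss = lossmix_loss(toy_predict, cross_entropy,
                    [1.0, 0.0], 0, [-1.0, 0.0], 1, lam=0.5)
print(loss)
```

Because the targets are never blended, the same pattern would presumably carry over to detection by letting `loss_fn` be the detector's own loss, evaluated once against each source image's boxes and classes. That sidesteps the question of how to mix two sets of bounding boxes.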