The performance of computer vision models is susceptible to unexpected changes in input images when deployed in real-world scenarios. These changes are referred to as common corruptions. Although they can hinder the applicability of computer vision models in practice, they are not always considered as a testbed for model generalization and robustness. In this survey, we present a comprehensive and systematic overview of methods that improve the corruption robustness of computer vision models. Unlike existing surveys that focus on adversarial attacks and label noise, we extensively cover robustness to the common corruptions that can occur when computer vision models are deployed in practical applications. We describe different types of image corruption and provide a definition of corruption robustness. We then introduce relevant evaluation metrics and benchmark datasets, and categorize methods into four groups. We also cover indirect methods that show improvements in generalization and may improve corruption robustness as a byproduct. We report benchmark results collected from the literature and find that methods are not evaluated in a unified manner, which makes the results difficult to compare and analyze. We therefore built a unified benchmark framework to obtain directly comparable results on benchmark datasets. Furthermore, we use this framework to evaluate relevant backbone networks pre-trained on ImageNet, providing an overview of the base corruption robustness of existing models to help choose appropriate backbones for computer vision tasks. We identify that developing methods that handle a wide range of corruptions and learn efficiently with limited data and computational resources is crucial for future development. Additionally, we highlight the need for further investigation into the relationship among corruption robustness, out-of-distribution (OOD) generalization, and shortcut learning.