Self-supervised learning in computer vision aims to exploit the inherent structure and relationships within data to learn meaningful representations without explicit human annotation, enabling a holistic understanding of visual scenes. Robustness in vision machine learning means reliable and consistent performance: good generalization, adaptability, and resistance to noise, variations, and adversarial attacks. Self-supervised paradigms, namely contrastive learning, knowledge distillation, mutual information maximization, and clustering, have shown advances in learning invariant representations. This work investigates, through detailed experiments, how robust the representations learned by these self-supervised approaches are to distribution shifts and image corruptions. The empirical analysis demonstrates a clear relationship between the performance of the learned representations and the severity of the shifts and corruptions: higher levels of shift and corruption significantly diminish the robustness of the learned representations. These findings highlight the critical impact of distribution shifts and image corruptions on the performance and resilience of self-supervised learning methods, and the need for effective strategies to mitigate their adverse effects. The study strongly advocates that future research in self-supervised representation learning prioritize safety and robustness to ensure practical applicability. The source code and results are available on GitHub.