Although deep learning-based segmentation models have achieved impressive performance on public benchmarks, generalizing well to unseen environments remains a major challenge. Test-time training (TTT) is a challenging paradigm that improves generalization to a new domain during evaluation by adapting the source-pretrained model in an online fashion. Early efforts on TTT mainly focus on image classification. Directly extending these methods to semantic segmentation often leads to unstable adaptation because of characteristics inherent to segmentation, such as extreme class imbalance and complex decision spaces. To stabilize the adaptation process, we introduce contrastive loss (CL), known for its capability to learn robust and generalized representations. Nevertheless, traditional CL operates in the representation space and cannot directly enhance predictions. In this paper, we resolve this limitation by adapting CL to the output space, employing a high temperature, and simplifying the formulation, resulting in a straightforward yet effective loss function called Output Contrastive Loss (OCL). Comprehensive experiments validate the efficacy of our approach across diverse evaluation scenarios. Notably, our method excels even when applied to models initially pre-trained using domain adaptation methods on test domain data, showcasing its resilience and adaptability.\footnote{Code and more information can be found at \url{https://github.com/dazhangyu123/OCL}.}