Machine learning algorithms have achieved remarkable success across disciplines and applications under the prevailing assumption that training and test samples are drawn from the same distribution. Consequently, these algorithms become brittle as soon as test samples start to deviate from those observed during training. Domain adaptation and domain generalization have been studied extensively as approaches to address distribution shifts between training and test domains, but each has its limitations. Test-time adaptation, a recently emerging learning paradigm, combines the benefits of both by training models only on source data and adapting them to target data during test-time inference. In this survey, we provide a comprehensive and systematic review of test-time adaptation, covering more than 400 recent papers. We organize existing methods into five categories according to which component is adjusted at test time: the model, the inference, the normalization, the sample, or the prompt, and we analyze each in detail. We further discuss the preparation and adaptation settings for methods within these categories, offering deeper insights into their effective deployment for evaluation under distribution shifts and into their real-world application in understanding images, video and 3D, as well as modalities beyond vision. We close the survey with an outlook on emerging research opportunities for test-time adaptation.