Current test-time adaptation (TTA) approaches aim to adapt to environments that change continuously. Yet it is unclear whether TTA methods can maintain their adaptability over prolonged periods. To answer this question, we introduce a diagnostic setting, **recurring TTA**, in which environments not only change but also recur over time, creating an extensive data stream. This setting allows us to examine error accumulation in TTA models in the most basic scenario, where they are regularly exposed to previously seen testing environments. Furthermore, we simulate a TTA process on a simple yet representative $\epsilon$-**perturbed Gaussian Mixture Model Classifier**, deriving theoretical insights into the dataset- and algorithm-dependent factors that contribute to gradual performance degradation. Our investigation leads us to propose **persistent TTA (PeTTA)**, which senses when the model is diverging towards collapse and adjusts the adaptation strategy accordingly, balancing the dual objectives of adaptation and model collapse prevention. Comprehensive experiments on various benchmarks demonstrate the superior stability of PeTTA over existing approaches in lifelong TTA scenarios.
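The recurring TTA setting described in the abstract can be illustrated with a minimal sketch of the visiting schedule: a fixed sequence of environments is replayed several times, so the model meets each environment again in every recurrence. The function name `recurring_stream`, its parameters, and the environment names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Iterator, Sequence, Tuple


def recurring_stream(
    environments: Sequence[str],
    num_recurrences: int,
    batches_per_visit: int,
) -> Iterator[Tuple[int, str, int]]:
    """Yield (recurrence index, environment name, batch index) in visiting order.

    Assumption: every recurrence visits the environments in the same order
    and spends the same number of batches in each one.
    """
    for recurrence in range(num_recurrences):
        for env in environments:
            for batch in range(batches_per_visit):
                yield recurrence, env, batch


# Hypothetical environment names; a real benchmark would supply its own.
schedule = list(
    recurring_stream(["fog", "snow", "noise"], num_recurrences=2, batches_per_visit=1)
)
for recurrence, env, batch in schedule:
    print(recurrence, env, batch)
```

In an actual evaluation, each yielded entry would correspond to one batch of unlabeled test data on which the model is adapted, and the error would be tracked per recurrence to reveal any gradual degradation.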