Continual Test-Time Adaptation (CTTA) aims to adapt a pretrained model to ever-changing environments at test time under continuous domain shifts. Most existing CTTA approaches build on the Mean Teacher (MT) structure, which comprises a student and a teacher model: the student is updated using pseudo-labels from the teacher, and the teacher is in turn updated from the student by an exponential moving average (EMA) strategy. However, these methods update all parameters of the MT model indiscriminately, so critical parameters that carry knowledge shared across domains may be erased, intensifying error accumulation and catastrophic forgetting. In this paper, we introduce the Parameter-Selective Mean Teacher (PSMT) method, which effectively updates the critical parameters of the MT network under domain shifts. First, we introduce a selective distillation mechanism in the student model that uses past knowledge to regularize novel knowledge, thereby mitigating error accumulation. Second, to avoid catastrophic forgetting, in the teacher model we build a mask from Fisher information to selectively update parameters via EMA, with preservation measures applied to crucial parameters. Extensive experimental results verify that PSMT outperforms state-of-the-art methods across multiple benchmark datasets. Our code is available at \url{https://github.com/JiaxuTian/PSMT}.
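The Fisher-masked teacher update described above can be illustrated with a minimal PyTorch sketch. This is an assumption-laden reconstruction, not PSMT's actual implementation: the function names (`fisher_scores`, `selective_ema_update`) and the `keep_ratio` threshold are hypothetical, and the diagonal empirical Fisher is approximated by squared gradients of the model's own predictions, since no labels are available at test time.

```python
# Hedged sketch of a selective EMA teacher update gated by a Fisher-
# information mask. Names and the keep_ratio heuristic are illustrative
# assumptions, not PSMT's published API.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fisher_scores(model, x):
    """Approximate the diagonal empirical Fisher information as squared
    gradients of the log-likelihood of the model's own predictions
    (no ground-truth labels at test time)."""
    model.zero_grad()
    logits = model(x)
    log_probs = F.log_softmax(logits, dim=1)
    # Use the model's argmax predictions as pseudo-targets.
    loss = F.nll_loss(log_probs, logits.argmax(dim=1))
    loss.backward()
    return {n: p.grad.detach() ** 2
            for n, p in model.named_parameters() if p.grad is not None}

def selective_ema_update(teacher, student, fisher, alpha=0.999, keep_ratio=0.1):
    """EMA-update only non-critical teacher parameters; entries whose Fisher
    score falls in the top `keep_ratio` fraction are preserved as-is."""
    with torch.no_grad():
        for (name, t_p), s_p in zip(teacher.named_parameters(),
                                    student.parameters()):
            f = fisher[name]
            # Entries with top-keep_ratio Fisher values count as "critical".
            k = max(1, int(keep_ratio * f.numel()))
            thresh = f.flatten().topk(k).values.min()
            mask = (f < thresh).float()  # 1 = EMA-update, 0 = preserve
            ema = alpha * t_p + (1 - alpha) * s_p
            t_p.copy_(mask * ema + (1 - mask) * t_p)
```

In this sketch, high-Fisher (critical) teacher parameters keep their old values while the rest follow the usual EMA of the student, matching the abstract's idea of preserving cross-domain knowledge during the teacher update.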