Continual Test-Time Adaptation (CTTA) aims to adapt a pretrained model to ever-changing environments at test time under continuous domain shifts. Most existing CTTA approaches build on the Mean Teacher (MT) structure, which comprises a student and a teacher model: the student is updated using pseudo-labels from the teacher, and the teacher is in turn updated from the student by an exponential moving average (EMA) strategy. However, these methods update all parameters of the MT model indiscriminately, so critical parameters that carry knowledge shared across domains may be erased, intensifying error accumulation and catastrophic forgetting. In this paper, we introduce the Parameter-Selective Mean Teacher (PSMT) method, which effectively updates the critical parameters of the MT network under domain shifts. First, we introduce a selective distillation mechanism in the student model that uses past knowledge to regularize novel knowledge, thereby mitigating error accumulation. Second, to avoid catastrophic forgetting, in the teacher model we build a mask from Fisher information to selectively update parameters via EMA, with preservation measures applied to crucial parameters. Extensive experimental results verify that PSMT outperforms state-of-the-art methods across multiple benchmark datasets. Our code is available at \url{https://github.com/JiaxuTian/PSMT}.
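The Fisher-masked teacher update described above can be illustrated with a minimal PyTorch sketch. This is an assumption-laden reconstruction, not PSMT's actual implementation: the function names (`fisher_scores`, `selective_ema_update`) and the `keep_ratio` threshold are hypothetical, and the diagonal empirical Fisher is approximated by squared gradients of the model's own predictions, since no labels are available at test time.

```python
# Hedged sketch of a selective EMA teacher update gated by a Fisher-
# information mask. Names and the keep_ratio heuristic are illustrative
# assumptions, not PSMT's published API.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fisher_scores(model, x):
    """Approximate the diagonal empirical Fisher information as squared
    gradients of the log-likelihood of the model's own predictions
    (no ground-truth labels at test time)."""
    model.zero_grad()
    logits = model(x)
    log_probs = F.log_softmax(logits, dim=1)
    # Use the model's argmax predictions as pseudo-targets.
    loss = F.nll_loss(log_probs, logits.argmax(dim=1))
    loss.backward()
    return {n: p.grad.detach() ** 2
            for n, p in model.named_parameters() if p.grad is not None}

def selective_ema_update(teacher, student, fisher, alpha=0.999, keep_ratio=0.1):
    """EMA-update only non-critical teacher parameters; entries whose Fisher
    score falls in the top `keep_ratio` fraction are preserved as-is."""
    with torch.no_grad():
        for (name, t_p), s_p in zip(teacher.named_parameters(),
                                    student.parameters()):
            f = fisher[name]
            # Entries with top-keep_ratio Fisher values count as "critical".
            k = max(1, int(keep_ratio * f.numel()))
            thresh = f.flatten().topk(k).values.min()
            mask = (f < thresh).float()  # 1 = EMA-update, 0 = preserve
            ema = alpha * t_p + (1 - alpha) * s_p
            t_p.copy_(mask * ema + (1 - mask) * t_p)
```

In this sketch, high-Fisher (critical) teacher parameters keep their old values while the rest follow the usual EMA of the student, matching the abstract's idea of preserving cross-domain knowledge during the teacher update.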