Test-time adaptation (TTA) has recently been proposed as a promising solution for addressing distribution shifts. It allows a base model to adapt to an unforeseen distribution during inference by leveraging information from a batch of unlabeled test data. However, we uncover a novel security vulnerability of TTA, based on the insight that predictions on benign samples can be affected by malicious samples in the same batch. To exploit this vulnerability, we propose the Distribution Invading Attack (DIA), which injects a small fraction of malicious data into the test batch. DIA causes models using TTA to misclassify benign, unperturbed test data, giving adversaries an entirely new capability that is infeasible in canonical machine learning pipelines. Through comprehensive evaluations, we demonstrate the high effectiveness of our attack on multiple benchmarks across six TTA methods. In response, we investigate two countermeasures to robustify existing insecure TTA implementations, following the principle of "security by design". We hope our findings make the community aware of the utility-security tradeoffs in deploying TTA and provide valuable insights for developing robust TTA approaches.
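The batch-level dependence behind this vulnerability can be illustrated with a minimal sketch. It assumes a TTA scheme that re-estimates normalization statistics from each test batch, which is one common family of TTA methods; the abstract itself does not specify the mechanism. The threshold classifier, the scalar feature values, and the crude outlier standing in for a malicious sample are all hypothetical choices for illustration, not the DIA procedure.

```python
from statistics import fmean, pstdev


def batch_normalize(batch):
    """Normalize each value with the mean and std of the current test batch."""
    mu = fmean(batch)
    sigma = pstdev(batch) or 1.0  # guard against a zero-variance batch
    return [(x - mu) / sigma for x in batch]


def predict(batch):
    """Toy classifier (assumption): class 1 if the normalized feature is positive."""
    return [1 if z > 0 else 0 for z in batch_normalize(batch)]


# Benign, unperturbed test samples, one scalar feature each (illustrative values).
benign = [-1.0, -0.6, -0.2, 0.4, 0.6, 0.8]

# Hypothetical attacker-controlled sample: a simple outlier, not an optimized input.
malicious = [50.0]

clean_preds = predict(benign)
# Same benign inputs, but now sharing a batch with the malicious sample.
attacked_preds = predict(benign + malicious)[: len(benign)]

# Indices of benign samples whose prediction changed without being modified.
flipped = [i for i, (a, b) in enumerate(zip(clean_preds, attacked_preds)) if a != b]

print("clean   :", clean_preds)
print("attacked:", attacked_preds)
print("flipped benign indices:", flipped)
```

In this toy setting, one injected sample out of seven shifts the batch mean enough to change the predictions on three benign inputs that were never modified. A pipeline that normalizes with fixed training-time statistics would not show this coupling, because each prediction would then depend only on its own input.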