A3-TTA: Adaptive Anchor Alignment Test-Time Adaptation for Image Segmentation

Test-Time Adaptation (TTA) offers a practical way to deploy image segmentation models under domain shift without accessing source data or retraining. Among existing TTA strategies, pseudo-label-based methods have shown promising performance. However, they often rely on perturbation-ensemble heuristics (e.g., dropout sampling, test-time augmentation, Gaussian noise), which lack distributional grounding and yield unstable training signals. This can trigger error accumulation and catastrophic forgetting during adaptation. To address this, we propose A3-TTA, a TTA framework that constructs reliable pseudo-labels through anchor-guided supervision. Specifically, we identify well-predicted target-domain images using a class compact density metric, under the assumption that confident predictions imply distributional proximity to the source domain. These anchors serve as stable references to guide pseudo-label generation, which is further regularized via semantic consistency and boundary-aware entropy minimization. Additionally, we introduce a self-adaptive exponential moving average strategy to mitigate label noise and stabilize model updates during adaptation. Evaluated on both multi-domain medical images (heart structure and prostate segmentation) and natural images, A3-TTA significantly improves the average Dice score by 10.40 to 17.68 percentage points over the source model and outperforms several state-of-the-art TTA methods across different segmentation model architectures. A3-TTA also excels in continual TTA, maintaining high performance across sequential target domains with strong anti-forgetting ability. The code will be made publicly available at https://github.com/HiLab-git/A3-TTA.
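As a rough illustration of the exponential moving average idea mentioned above, the following minimal sketch updates a teacher's parameters from a student's, with a momentum that grows when predictions are uncertain. It is not the A3-TTA implementation: the function names, the entropy-based momentum rule, and the `base` and `ceiling` values are all assumptions made for illustration only.

```python
import math


def mean_entropy(probs):
    """Mean Shannon entropy (in nats) over per-pixel class-probability vectors."""
    total = 0.0
    for p in probs:
        total += -sum(q * math.log(q) for q in p if q > 0.0)
    return total / len(probs)


def adaptive_momentum(probs, base=0.99, ceiling=0.999):
    """Hypothetical rule (not from the paper): scale momentum with uncertainty.

    Normalized entropy lies in [0, 1]; more uncertain predictions push the
    momentum toward `ceiling`, so the teacher changes more slowly.
    """
    num_classes = len(probs[0])
    uncertainty = mean_entropy(probs) / math.log(num_classes)
    return base + (ceiling - base) * uncertainty


def ema_update(teacher, student, momentum):
    """Standard EMA: teacher <- momentum * teacher + (1 - momentum) * student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


# Usage: confident two-class predictions keep the base momentum.
confident = [[1.0, 0.0], [0.0, 1.0]]
m = adaptive_momentum(confident)
new_teacher = ema_update([1.0, 2.0], [0.0, 0.0], m)
```

Tying the momentum to prediction entropy is just one plausible way to make the update "self-adaptive"; the actual criterion used by the authors may differ.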
