Test-time adaptation (TTA) adapts a trained model to a new domain at test time. Existing TTA techniques rely on having multiple test images from the same domain, which may be impractical in real-world applications such as medical imaging, where data acquisition is expensive and imaging conditions vary frequently. Here, we address this setting: adapting a medical image segmentation model with only a single unlabeled test image. Most TTA approaches directly minimize the entropy of predictions and fail to improve performance significantly in this setting. We also observe that the choice of batch normalization (BN) layer statistics is a highly important yet unstable factor, because only a single example from the test domain is available. To overcome this, we propose to instead \textit{integrate} over predictions made with various estimates of target domain statistics, lying between the training and test statistics, weighted based on their entropy statistics.
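To make the idea of integrating over BN statistics concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy "model" with a single normalization step followed by a linear per-pixel classifier, a linear interpolation between training and test-image statistics controlled by a coefficient `alpha`, and weights given by a softmax over negative mean prediction entropy. The interpolation rule, the weighting rule, the `temperature` parameter, and all function names are illustrative assumptions; the text above only states that predictions are weighted based on their entropy statistics.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mean_entropy(p, eps=1e-12):
    """Mean Shannon entropy of per-pixel class probabilities p with shape (N, C)."""
    return float(-(p * np.log(p + eps)).sum(axis=-1).mean())


def predict_with_stats(x, mu, var, w, b, eps=1e-5):
    """Toy stand-in for a segmentation network: normalize with the given
    statistics, then apply a linear per-pixel classifier."""
    x_hat = (x - mu) / np.sqrt(var + eps)
    return softmax(x_hat @ w + b)


def integrate_over_bn_stats(x, mu_train, var_train, w, b, alphas, temperature=1.0):
    """Fuse predictions made under several estimates of the target statistics.

    Assumption: each estimate is a linear interpolation between the training
    statistics (alpha = 0) and the single test image's statistics (alpha = 1).
    Assumption: fusion weights are softmax(-entropy / temperature), so
    lower-entropy predictions receive more weight.
    """
    mu_test, var_test = x.mean(axis=0), x.var(axis=0)
    probs, ents = [], []
    for a in alphas:
        mu = (1.0 - a) * mu_train + a * mu_test
        var = (1.0 - a) * var_train + a * var_test
        p = predict_with_stats(x, mu, var, w, b)
        probs.append(p)
        ents.append(mean_entropy(p))
    ents = np.array(ents)
    weights = softmax(-ents / temperature)
    # Weighted average over the K candidate predictions: (K,) x (K, N, C) -> (N, C).
    fused = np.tensordot(weights, np.stack(probs), axes=1)
    return fused, weights, ents


# Synthetic example: 64 "pixels" with 4 features each, shifted away from the
# training statistics to mimic a domain shift.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(64, 4))
mu_train, var_train = np.zeros(4), np.ones(4)
w, b = rng.normal(size=(4, 3)), np.zeros(3)
alphas = np.linspace(0.0, 1.0, 5)

fused, weights, ents = integrate_over_bn_stats(x, mu_train, var_train, w, b, alphas)
print("fusion weights per alpha:", np.round(weights, 3))
```

In a real network the same interpolation would be applied to the running mean and variance of every BN layer rather than to a single normalization step, and the number and spacing of `alpha` values would be a design choice.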