Deep learning-based medical image segmentation models often degrade in performance when deployed across different medical centers, largely because of discrepancies in data distribution. Test-time adaptation (TTA) methods, which adapt pre-trained models to test data, have been used to mitigate these discrepancies. However, existing TTA methods focus primarily on manipulating batch normalization (BN) layers or on prompt and adversarial learning, which may not effectively correct the inconsistencies that arise from divergent data distributions. In this paper, we propose a novel Human-in-the-loop TTA (HiTTA) framework with two distinguishing features. First, it exploits the largely overlooked potential of clinician-corrected predictions, integrating these corrections into the TTA process to steer the model toward predictions that align more closely with clinical annotation preferences. Second, it introduces a divergence loss designed to reduce the prediction divergence caused by domain disparities by calibrating BN parameters. HiTTA therefore both adapts to the distribution of the test data and keeps the model's predictions aligned with clinical expectations, which enhances its relevance in a medical context. Extensive experiments on a public dataset show that HiTTA outperforms existing TTA methods, highlighting the benefits of integrating human feedback and the divergence loss for improving model performance and adaptability across diverse medical centers.
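The abstract does not specify how the two training signals are formulated, so the following is only a minimal sketch of how such an objective might be assembled, under stated assumptions. It assumes the clinician-corrected mask is used as a target for a soft Dice loss and that the divergence term is a symmetric KL divergence between two per-pixel foreground-probability maps. The function names, the `reference_pred` input, and the `weight` parameter are all hypothetical and do not come from the paper. In an actual TTA setting the resulting scalar would be backpropagated into the BN parameters only; this sketch computes the loss value and nothing more.

```python
import math


def soft_dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss between foreground probabilities and a binary mask.

    Assumption: the clinician-corrected mask plays the role of `target`.
    """
    intersection = sum(p * t for p, t in zip(probs, target))
    denom = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def symmetric_kl(p_probs, q_probs, eps=1e-6):
    """Mean symmetric KL divergence between per-pixel Bernoulli predictions.

    Assumption: this stands in for the paper's divergence loss, whose exact
    form is not given in the abstract.
    """
    total = 0.0
    for p, q in zip(p_probs, q_probs):
        # Clamp to avoid log(0) and division by zero.
        p = min(max(p, eps), 1.0 - eps)
        q = min(max(q, eps), 1.0 - eps)
        kl_pq = p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
        kl_qp = q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
        total += 0.5 * (kl_pq + kl_qp)
    return total / len(p_probs)


def combined_objective(pred, corrected_mask, reference_pred, weight=1.0):
    """Hypothetical combined loss: human-feedback term plus divergence term."""
    supervised = soft_dice_loss(pred, corrected_mask)
    divergence = symmetric_kl(pred, reference_pred)
    return supervised + weight * divergence
```

Here `pred` and `reference_pred` are flattened lists of foreground probabilities, for example from the adapted model and from a reference forward pass, and `corrected_mask` is the flattened binary mask after clinician correction. The choice of a reference pass is itself an assumption made for illustration.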