Interactive segmentation uses real-time user inputs, such as mouse clicks, to iteratively refine model predictions. Although not originally designed to address distribution shift, this paradigm lends itself naturally to the problem. In medical imaging, where distribution shifts are common, interactive methods can use user inputs to guide a model towards better predictions. Moreover, once a model is deployed, user corrections can be used to adapt the network parameters to the new data distribution, mitigating the shift. Based on these insights, we aim to develop a practical, effective method for improving how interactive segmentation models adapt to new data distributions in medical imaging. First, we find that strengthening the model's responsiveness to clicks is important during initial training. Second, we show that by treating the post-interaction, user-refined model output as pseudo-ground-truth, we can design a lean, practical online adaptation method that enables a model to learn effectively across sequential test images. The framework has two components: (i) a Post-Interaction adaptation process, which updates the model after the user has finished interactively refining an image, and (ii) a Mid-Interaction adaptation process, which updates the model incrementally after each click. Both processes use a Click-Centered Gaussian loss that strengthens the model's reaction to clicks and sharpens its focus on user-guided, clinically relevant regions. Experiments on five fundus and four brain-MRI datasets show that our approach consistently outperforms existing methods under diverse distribution shifts, including unseen imaging modalities and pathologies. Code and pretrained models will be released upon publication.
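To make the idea of a click-centered loss concrete, the following is a minimal sketch under stated assumptions, not the released implementation. The abstract does not give the exact form of the Click-Centered Gaussian loss, so the sketch assumes one plausible reading: a binary cross-entropy against the user-refined pseudo-ground-truth, weighted per pixel by a Gaussian centered on each click. The function names, the `sigma` and `base_weight` parameters, and the use of plain Python lists (rather than a tensor library) are all illustrative choices.

```python
import math


def gaussian_click_map(height, width, clicks, sigma=2.0):
    """Per-pixel weights in [0, 1]: the max over Gaussians centered on each click.

    `clicks` is a list of (row, col) positions. Hypothetical helper; the
    paper's exact weighting is not specified in the abstract.
    """
    weights = [[0.0] * width for _ in range(height)]
    for cy, cx in clicks:
        for y in range(height):
            for x in range(width):
                d2 = (y - cy) ** 2 + (x - cx) ** 2
                g = math.exp(-d2 / (2.0 * sigma ** 2))
                weights[y][x] = max(weights[y][x], g)
    return weights


def click_weighted_bce(probs, pseudo_gt, weights, base_weight=1.0, eps=1e-7):
    """Weighted binary cross-entropy against the pseudo-ground-truth mask.

    Each pixel's weight is `base_weight` plus its Gaussian click weight, so
    errors near clicks cost more (an assumed formulation). The result is
    normalized by the total weight.
    """
    total = 0.0
    norm = 0.0
    for p_row, t_row, w_row in zip(probs, pseudo_gt, weights):
        for p, t, w in zip(p_row, t_row, w_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            bce = -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            pixel_weight = base_weight + w
            total += pixel_weight * bce
            norm += pixel_weight
    return total / norm


if __name__ == "__main__":
    size = 5
    weights = gaussian_click_map(size, size, clicks=[(2, 2)], sigma=1.0)
    pseudo_gt = [[0] * size for _ in range(size)]

    # Same single-pixel error, once at the click and once in a far corner.
    near = [[0.1] * size for _ in range(size)]
    near[2][2] = 0.9
    far = [[0.1] * size for _ in range(size)]
    far[0][0] = 0.9

    print("error at click:", click_weighted_bce(near, pseudo_gt, weights))
    print("error far away:", click_weighted_bce(far, pseudo_gt, weights))
```

Under this assumed weighting, the same prediction error is penalized more when it falls at a click than when it falls far from one, which is one way a loss could push the model to respond to clicks. In an online adaptation setting, such a loss would be minimized with a gradient step either after each click (Mid-Interaction) or once the user has finished refining the image (Post-Interaction).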