Despite progress in medical image segmentation, most existing methods remain task-specific and lack interactivity. Recent text-prompt-based approaches make segmentation more user-driven and reasoning-based, but they are confined to single-round dialogues and cannot perform multi-round reasoning. In this work, we introduce Multi-Round Entity-Level Medical Reasoning Segmentation (MEMR-Seg), a new task that requires generating segmentation masks through multi-round queries with entity-level reasoning. To support this task, we construct MR-MedSeg, a large-scale dataset of 177K multi-round medical segmentation dialogues featuring entity-based reasoning across rounds. Furthermore, we propose MediRound, an effective baseline model designed for multi-round medical reasoning segmentation. To mitigate the error propagation inherent in the chain-like pipeline of multi-round segmentation, we introduce a lightweight yet effective Judgment & Correction Mechanism at inference time. Experimental results demonstrate that our method effectively addresses the MEMR-Seg task and outperforms conventional medical referring segmentation methods.