Text-guided medical image editing must satisfy the requested pathology while preserving anatomy, modality-specific appearance, and clinical plausibility. However, existing datasets largely supervise editors with only the final accepted edits and discard the failed attempts produced during generation. We argue that these failures provide essential supervision for quality control: they specify what should be rejected, why an edit is medically or visually invalid, and how the instruction should be revised. We present Med-Banana, a trajectory-supervised framework for quality-controlled medical image editing. We introduce Med-Banana-80K, a large-scale resource of success-and-failure editing trajectories, each with candidate images, verification outcomes, rejection reasons, and prompt refinements. Building on it, Med-Banana jointly trains an editor, a verifier, and a refiner on both accepted and rejected attempts, enabling edit--verify--refine inference. Experiments with MLLM judges, blind expert assessment, and source-preservation and real--synthetic separability probes demonstrate consistent improvements over open medical image editors. Code and data are publicly available.
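The edit--verify--refine inference described in the abstract can be pictured as a short loop that records every attempt, accepted or rejected, as part of a trajectory. The following is only a minimal sketch under stated assumptions: the names `Attempt`, `Trajectory`, `edit_verify_refine`, `max_rounds`, and the three toy components are hypothetical and do not come from the Med-Banana code, images are stood in for by strings, and the real editor, verifier, and refiner are learned models rather than the string rules used here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One editing attempt (hypothetical record layout)."""
    prompt: str
    candidate: str                      # stand-in for a candidate image
    accepted: bool                      # verification outcome
    rejection_reason: Optional[str] = None


@dataclass
class Trajectory:
    """All attempts for one source image and instruction."""
    source: str
    instruction: str
    attempts: List[Attempt] = field(default_factory=list)

    @property
    def final(self) -> Optional[Attempt]:
        # The accepted edit, if the last attempt passed verification.
        if self.attempts and self.attempts[-1].accepted:
            return self.attempts[-1]
        return None


Editor = Callable[[str, str], str]
Verifier = Callable[[str, str, str], Tuple[bool, Optional[str]]]
Refiner = Callable[[str, str], str]


def edit_verify_refine(source: str, instruction: str, editor: Editor,
                       verifier: Verifier, refiner: Refiner,
                       max_rounds: int = 3) -> Trajectory:
    """Edit, verify, and refine the prompt until accepted or out of rounds."""
    traj = Trajectory(source, instruction)
    prompt = instruction
    for _ in range(max_rounds):
        candidate = editor(source, prompt)
        accepted, reason = verifier(source, instruction, candidate)
        # Failed attempts are kept, not discarded.
        traj.attempts.append(Attempt(prompt, candidate, accepted, reason))
        if accepted:
            break
        prompt = refiner(prompt, reason or "")
    return traj


# Toy components (assumptions for illustration only).
def toy_editor(source: str, prompt: str) -> str:
    return f"{source}|{prompt}"


def toy_verifier(source: str, instruction: str, candidate: str):
    if "preserve anatomy" in candidate:
        return True, None
    return False, "anatomy not preserved"


def toy_refiner(prompt: str, reason: str) -> str:
    return f"{prompt}; preserve anatomy"


demo = edit_verify_refine("cxr_001", "add left pleural effusion",
                          toy_editor, toy_verifier, toy_refiner)
for a in demo.attempts:
    print(a.accepted, a.prompt, a.rejection_reason)
```

In this toy run the first attempt is rejected with a reason, the refiner revises the prompt, and the second attempt is accepted; the resulting trajectory holds both, which mirrors the success-and-failure records the abstract describes.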