The scarcity of high-quality data remains a primary bottleneck in adapting multimodal generative models to medical image editing. Existing medical image editing datasets often suffer from limited diversity, neglect medical image understanding, and fail to balance quality with scalability. To address these gaps, we propose MieDB-100k, a large-scale, high-quality, and diverse dataset for text-guided medical image editing. It groups editing tasks into three perspectives, Perception, Modification, and Transformation, covering both understanding and generation abilities. We construct MieDB-100k with a data curation pipeline that combines modality-specific expert models and rule-based data synthesis methods, followed by rigorous manual inspection to ensure clinical fidelity. Extensive experiments demonstrate that models trained on MieDB-100k consistently outperform both open-source and proprietary models while exhibiting strong generalization ability. We anticipate that this dataset will serve as a cornerstone for future advances in specialized medical image editing.