medigan: a Python library of pretrained generative models for medical image synthesis

Richard Osuala, Grzegorz Skorupko, Noussair Lazrak, Lidia Garrucho, Eloy García, Smriti Joshi, Socayna Jouide, Michael Rutherford, Fred Prior, Kaisar Kushibar, Oliver Diaz, Karim Lekadir

From arXiv, 32 pages, 7 figures

Synthetic data generated by generative models can enhance the performance and capabilities of data-hungry deep learning models in medical imaging. However, (1) the availability of (synthetic) datasets is limited and (2) generative models are complex to train, which hinders their adoption in research and clinical applications. To reduce this entry barrier, we propose medigan, a one-stop shop for pretrained generative models implemented as an open-source, framework-agnostic Python library. medigan allows researchers and developers to create, increase, and domain-adapt their training data in just a few lines of code. Guided by design decisions based on gathered end-user requirements, we implement medigan based on modular components for generative model (i) execution, (ii) visualisation, (iii) search & ranking, and (iv) contribution. The library's scalability and design are demonstrated by its growing number of integrated and readily usable pretrained generative models, which currently comprise 21 models utilising 9 different Generative Adversarial Network architectures trained on 11 datasets from 4 domains, namely mammography, endoscopy, x-ray, and MRI. Furthermore, 3 applications of medigan are analysed in this work: (a) enabling community-wide sharing of restricted data, (b) investigating generative model evaluation metrics, and (c) improving clinical downstream tasks. In (b), extending common medical image synthesis assessment and reporting standards, we show how the Fréchet Inception Distance varies with image normalisation and radiology-specific feature extraction.
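As a rough illustration of the metric mentioned in (b), the Fréchet distance between two sets of image features can be sketched as below. This is not taken from medigan or from the paper: the function name `frechet_distance` is hypothetical, the inputs are assumed to be feature vectors that have already been extracted (for example by an Inception network or a radiology-specific encoder), and the trace of the matrix square root is computed from eigenvalues to keep the sketch dependent on NumPy only.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two feature sets.

    Hypothetical helper for illustration. Each input is assumed to have
    shape (n_samples, n_features) and to hold pre-extracted features.
    """
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    # atleast_2d keeps the single-feature case a 1x1 matrix.
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b. For positive semi-definite covariances
    # these are real and non-negative up to numerical noise, so the
    # imaginary parts are dropped and small negatives are clipped to zero.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(
        diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 4))          # stand-in for real-image features
    synthetic = real + 2.0                    # same spread, shifted mean
    print(frechet_distance(real, real))       # close to zero
    print(frechet_distance(real, synthetic))  # dominated by the mean shift
```

Because the score is computed entirely from the extracted features, anything that changes those features, such as the image normalisation applied before extraction or the choice of feature extractor, can change the resulting value.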

