The emergence of clinical data warehouses (CDWs), which contain the medical data of millions of patients, has paved the way for large-scale data sharing for research. The quality of MRIs gathered in CDWs differs greatly from what is observed in research settings and reflects clinical reality. Consequently, a significant proportion of these images are unusable because of their poor quality. Given the massive volume of MRIs contained in CDWs, manual rating of image quality is not feasible, so an automated solution capable of identifying corrupted images is needed. This study presents a transfer learning method, based on artefact simulation, for automated quality control of 3D gradient echo T1-weighted brain MRIs within a CDW. We first intentionally corrupt images from research datasets by degrading contrast, adding noise and introducing motion artefacts. Three artefact-specific models are then pre-trained on these corrupted images, each to detect one type of artefact. Finally, the models are generalised to routine clinical data through transfer learning, using 3660 manually annotated images. The overall image quality is inferred from the outputs of the three models. Our method was validated on an independent test set of 385 3D gradient echo T1-weighted MRIs. The proposed approach detected poor-quality MRIs with a balanced accuracy of over 87%, surpassing our previous approach by 3.5 percentage points. It also detected moderate-quality MRIs with a balanced accuracy of 79%, improving on our previous results by 5 percentage points. Our framework provides a valuable tool for exploiting the potential of MRIs in CDWs.
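To make the artefact simulation step concrete, the following is a minimal NumPy sketch of the three corruptions named above. It is not the authors' implementation: the function names, the parameters (`alpha`, `sigma`, `shift`, `corrupted_fraction`) and the specific corruption models (blending intensities toward the mean, additive Gaussian noise, and mixing k-space planes from a translated copy of the volume) are assumptions chosen for illustration only.

```python
import numpy as np


def reduce_contrast(volume, alpha=0.5):
    """Shrink intensity differences around the mean (assumed contrast model).

    alpha=1 leaves the volume unchanged; alpha=0 yields a flat image.
    """
    mean = volume.mean()
    return mean + alpha * (volume - mean)


def add_noise(volume, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise with standard deviation sigma (assumed noise model)."""
    rng = np.random.default_rng() if rng is None else rng
    return volume + rng.normal(0.0, sigma, size=volume.shape)


def add_motion(volume, shift=3, corrupted_fraction=0.3, rng=None):
    """Imitate a motion artefact by mixing k-space data from two head positions.

    A randomly chosen fraction of k-space planes along axis 0 is taken from a
    copy of the volume translated by `shift` voxels along axis 1 (hypothetical,
    simplified motion model: a single rigid translation, no rotation).
    """
    rng = np.random.default_rng() if rng is None else rng
    moved = np.roll(volume, shift, axis=1)
    k_still = np.fft.fftn(volume)
    k_moved = np.fft.fftn(moved)
    n_planes = volume.shape[0]
    n_corrupted = int(corrupted_fraction * n_planes)
    planes = rng.choice(n_planes, size=n_corrupted, replace=False)
    k_still[planes] = k_moved[planes]
    # Return the magnitude image, as is usual for reconstructed MRI.
    return np.abs(np.fft.ifftn(k_still))
```

Under these assumptions, each function would produce the training data for one artefact-specific model, with the label set to 1 for corrupted volumes and 0 for the untouched originals.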