The complex heterogeneity of brain tumours is increasingly recognized to demand data of a magnitude and richness that only fully inclusive, large-scale collections drawn from routine clinical care could plausibly offer. Contemporary machine learning could facilitate this task, especially in neuroimaging, but its ability to deal with the incomplete data common in real-world clinical practice remains unknown. Here we apply state-of-the-art methods to large-scale, multi-site MRI data to quantify the comparative fidelity of automated tumour segmentation models that replicate the various levels of sequence availability observed in clinical reality. We compare deep learning (nnU-Net-derived) segmentation models built on all possible combinations of T1, contrast-enhanced T1, T2, and FLAIR sequences, trained and validated with five-fold cross-validation on the 2021 BraTS-RSNA glioma population of 1251 patients, with further testing on a real-world 50-patient sample that is diverse not only in MRI scanner and field strength but also in its random selection of pre- and post-operative imaging. Models trained on incomplete imaging data segmented lesions well, often equivalently to those trained on complete data, exhibiting Dice coefficients of 0.907 (single sequence) to 0.945 (full datasets) for whole tumours, and 0.701 (single sequence) to 0.891 (full datasets) for component tissue types. Incomplete-data segmentation models could accurately detect enhancing tumour in the absence of contrast imaging, quantifying its volume with an R² of 0.95 to 0.97, and were invariant to lesion morphometry. Deep learning segmentation models characterize tumours well when data are missing and can even detect enhancing tissue without the use of contrast. This suggests that translation to clinical practice, where incomplete data are common, may be easier than hitherto believed, and that such models may be of value in reducing dependence on contrast use.
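As a rough illustration of the quantities reported above, the following sketch enumerates the possible sequence combinations and computes a Dice coefficient and an R² value in plain Python. It is not the study's code: the function names, the sequence labels, the flat-list mask representation, and the empty-mask convention are all assumptions made for illustration, and the abstract does not state how R² was computed, so the coefficient-of-determination form used here is likewise an assumption.

```python
from itertools import combinations

# Sequence names follow the abstract; these string labels are illustrative.
SEQUENCES = ("T1", "T1CE", "T2", "FLAIR")


def sequence_subsets(sequences=SEQUENCES):
    """Return every non-empty subset of the sequences, smallest first."""
    return [
        combo
        for size in range(1, len(sequences) + 1)
        for combo in combinations(sequences, size)
    ]


def dice(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def r_squared(predicted, reference):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference values have zero variance")
    return 1.0 - ss_res / ss_tot
```

With four sequences there are 2⁴ − 1 = 15 non-empty combinations, so `len(sequence_subsets())` gives the number of distinct input configurations such a comparison has to cover, from each single sequence up to the complete set.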