Objective: To develop and evaluate a deep radiomics model for detecting clinically significant prostate cancer (csPCa, grade group >= 2) and to compare its performance with Prostate Imaging Reporting and Data System (PI-RADS) assessment in a multicenter cohort. Materials and Methods: This retrospective study analyzed biparametric (T2W and DW) prostate MRI sequences of 615 patients (mean age, 63.1 +/- 7 years) from four datasets acquired between 2010 and 2020: the PROSTATEx challenge, the Prostate158 challenge, the PCaMAP trial, and an in-house (NTNU/St. Olavs Hospital) dataset. With expert annotations as ground truth, a deep radiomics model was trained, comprising nnU-Net segmentation of the prostate gland, voxel-wise radiomic feature extraction, extreme gradient boosting classification, and post-processing of tumor probability maps into csPCa detection maps. Training used 5-fold cross-validation on the PROSTATEx (n=199), Prostate158 (n=138), and PCaMAP (n=78) datasets, and testing used the in-house (n=200) dataset. Patient- and lesion-level performance was compared with PI-RADS using the area under the receiver operating characteristic curve (AUROC [95% CI]), sensitivity, and specificity. Results: On the test data, the radiologist achieved a patient-level AUROC of 0.94 [0.91-0.98] with 94% (75/80) sensitivity and 77% (92/120) specificity at PI-RADS >= 3. The deep radiomics model at a tumor probability cut-off >= 0.76 achieved an AUROC of 0.91 [0.86-0.95] with 90% (72/80) sensitivity and 73% (87/120) specificity, not significantly different (p = 0.068) from PI-RADS. At the lesion level, a PI-RADS cut-off >= 3 had 84% (91/108) sensitivity at 0.2 (40/200) false positives per patient, while deep radiomics attained 68% (73/108) sensitivity at the same false positive rate. Conclusion: The deep radiomics machine learning model achieved performance comparable to PI-RADS assessment for csPCa detection at the patient level but not at the lesion level.
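The reported sensitivity, specificity, and false-positive figures follow directly from the counts given in parentheses. The short Python sketch below recomputes them from those counts as an arithmetic check only. It is not the authors' evaluation code, and the dictionary layout and function names are illustrative assumptions.

```python
# Counts copied from the abstract (in-house test set, n=200 patients,
# 80 csPCa-positive and 120 csPCa-negative patients, 108 csPCa lesions).
# The data layout and helper names are hypothetical, chosen for illustration.
PATIENT_LEVEL = {
    "PI-RADS >= 3": {"tp": 75, "pos": 80, "tn": 92, "neg": 120},
    "Deep radiomics >= 0.76": {"tp": 72, "pos": 80, "tn": 87, "neg": 120},
}
LESION_LEVEL = {
    "PI-RADS >= 3": {"detected": 91, "lesions": 108},
    "Deep radiomics": {"detected": 73, "lesions": 108},
}
FALSE_POSITIVES = 40  # total false positives at the matched operating point
N_PATIENTS = 200


def sensitivity(true_positives: int, positives: int) -> float:
    """Fraction of positive cases that were correctly detected."""
    return true_positives / positives


def specificity(true_negatives: int, negatives: int) -> float:
    """Fraction of negative cases that were correctly ruled out."""
    return true_negatives / negatives


def false_positives_per_patient(false_positives: int, patients: int) -> float:
    """Average number of false-positive detections per patient."""
    return false_positives / patients


if __name__ == "__main__":
    for name, c in PATIENT_LEVEL.items():
        sens = sensitivity(c["tp"], c["pos"])
        spec = specificity(c["tn"], c["neg"])
        print(f"patient level, {name}: sensitivity={sens:.4f}, specificity={spec:.4f}")
    for name, c in LESION_LEVEL.items():
        sens = sensitivity(c["detected"], c["lesions"])
        print(f"lesion level, {name}: sensitivity={sens:.4f}")
    fpp = false_positives_per_patient(FALSE_POSITIVES, N_PATIENTS)
    print(f"false positives per patient: {fpp:.2f}")
```

The fractions are printed unrounded to four decimals so that they can be compared with the rounded percentages in the text.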