Purpose: To evaluate multi-expert deep learning training methods for automatically quantifying ischemic brain tissue on non-contrast CT. Materials and Methods: The data set consisted of 260 non-contrast CT scans from 233 patients with acute ischemic stroke recruited in the DEFUSE 3 trial. A benchmark U-Net was trained on the reference annotations of three experienced neuroradiologists to segment ischemic brain tissue, using majority vote and random expert sampling as training schemes. We used a one-sided Wilcoxon signed-rank test on a set of segmentation metrics to compare bootstrapped point estimates of each training scheme with the inter-expert agreement, and the ratio of variance to analyze consistency. We further compared volumes with the 24-hour follow-up DWI (final infarct core) in the patient subgroup with full reperfusion, and tested volumes for correlation with clinical outcome (mRS after 30 and 90 days) using the Spearman method. Results: Random expert sampling led to a model that agreed better with the experts than the experts agreed among themselves, and better than a majority-vote model agreed with the experts (Surface Dice at a 5 mm tolerance improved by 61% to 0.70 ± 0.03, and Dice improved by 25% to 0.50 ± 0.04). The model-predicted volume estimated the final infarct volume similarly to, and correlated better with clinical outcome than, CT perfusion. Conclusion: A model trained with random expert sampling can identify the presence and location of acute ischemic brain tissue on non-contrast CT similarly to CT perfusion and with better consistency than experts. This may make the selection of patients eligible for endovascular treatment more reliable in less specialized hospitals.
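The two training schemes differ only in how a single training target is derived from the three expert annotations. The following is a minimal sketch of that difference, not the authors' implementation: the function names (`majority_vote_label`, `random_expert_label`, `dice`) are hypothetical, binary masks are assumed, and the random scheme is assumed to draw one expert uniformly each time a sample is used, which the abstract does not specify.

```python
import numpy as np


def majority_vote_label(expert_masks):
    """Voxel-wise majority vote over binary masks of shape (n_experts, ...)."""
    masks = np.asarray(expert_masks, dtype=bool)
    votes = masks.sum(axis=0)
    # A voxel is foreground if more than half of the experts marked it.
    return votes * 2 > masks.shape[0]


def random_expert_label(expert_masks, rng):
    """Return the mask of one expert drawn uniformly at random (assumption)."""
    masks = np.asarray(expert_masks, dtype=bool)
    return masks[rng.integers(masks.shape[0])]


def dice(a, b):
    """Dice coefficient between two binary masks (1.0 if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom


# Toy example: three "experts" annotating four voxels.
experts = np.array([[1, 1, 0, 0],
                    [1, 0, 0, 0],
                    [1, 1, 1, 0]])
rng = np.random.default_rng(0)

mv_target = majority_vote_label(experts)       # fixed target for every epoch
rs_target = random_expert_label(experts, rng)  # redrawn each time the scan is seen
```

Under majority vote the model always sees the same consensus target for a scan, whereas random sampling exposes it to each expert's annotation over the course of training.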