Grading inflammatory bowel disease (IBD) activity using standardized histopathological scoring systems remains challenging due to resource constraints and inter-observer variability. In this study, we developed a deep learning model to classify activity grades in hematoxylin and eosin-stained whole slide images (WSIs) from patients with IBD, offering a robust approach for general pathologists. We utilized 2,077 WSIs from 636 patients treated at Dartmouth-Hitchcock Medical Center in 2018 and 2019, scanned at 40x magnification (0.25 micron/pixel). Board-certified gastrointestinal pathologists categorized the WSIs into four activity classes: inactive, mildly active, moderately active, and severely active. A transformer-based model was developed and validated using five-fold cross-validation to classify IBD activity. Using HoVerNet, we examined neutrophil distribution across activity grades. Attention maps from our model highlighted areas contributing to its prediction. The model classified IBD activity with weighted averages of 0.871 [95% Confidence Interval (CI): 0.860-0.883] for the area under the curve, 0.695 [95% CI: 0.674-0.715] for precision, 0.697 [95% CI: 0.678-0.716] for recall, and 0.695 [95% CI: 0.674-0.714] for F1-score. Neutrophil distribution was significantly different across activity classes. Qualitative evaluation of attention maps by a gastrointestinal pathologist suggested their potential for improved interpretability. Our model demonstrates robust diagnostic performance and could enhance consistency and efficiency in IBD activity assessment.
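As an illustration of how the weighted precision, recall, and F1-score reported above can be computed for a four-class grading task, here is a minimal Python sketch. It is not the authors' evaluation code. The abstract does not say how the confidence intervals were obtained, so the percentile bootstrap used here is an assumption. The names `CLASSES`, `weighted_prf`, and `bootstrap_ci` are hypothetical. The weighted AUC is omitted because it needs per-class predicted probabilities rather than hard labels.

```python
import random
from collections import Counter

# The four activity grades named in the abstract (short labels are illustrative).
CLASSES = ["inactive", "mild", "moderate", "severe"]


def weighted_prf(y_true, y_pred, labels=CLASSES):
    """Return support-weighted (precision, recall, F1) over the given labels."""
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    precision = recall = f1 = 0.0
    for label in labels:
        # Per-class metrics; a class with no predictions or no samples scores 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / support[label] if support[label] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        w = support[label] / n     # weight each class by its share of true labels
        precision += w * p
        recall += w * r
        f1 += w * f
    return precision, recall, f1


def bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CIs for (precision, recall, F1).

    Assumption: resampling is done per slide with replacement; the study may
    have used a different procedure (for example, resampling per patient).
    """
    rng = random.Random(seed)
    n = len(y_true)
    samples = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        samples.append(
            weighted_prf([y_true[i] for i in idx], [y_pred[i] for i in idx])
        )
    intervals = []
    for k in range(3):
        values = sorted(s[k] for s in samples)
        lower = values[int((alpha / 2) * n_boot)]
        upper = values[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
        intervals.append((lower, upper))
    return intervals
```

Calling `weighted_prf(y_true, y_pred)` on two equal-length lists of grade labels returns the three point estimates, and `bootstrap_ci(y_true, y_pred)` returns a matching list of `(lower, upper)` pairs. Because several slides come from the same patient (2,077 WSIs from 636 patients), resampling by patient rather than by slide would likely give more realistic intervals.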