Deep Learning for Classification of Inflammatory Bowel Disease Activity in Whole Slide Images of Colonic Histopathology

Grading inflammatory bowel disease (IBD) activity using standardized histopathological scoring systems remains challenging due to resource constraints and inter-observer variability. In this study, we developed a deep learning model to classify activity grades in hematoxylin and eosin-stained whole slide images (WSIs) from patients with IBD, offering a robust approach for general pathologists. We utilized 2,077 WSIs from 636 patients treated at Dartmouth-Hitchcock Medical Center in 2018 and 2019, scanned at 40x magnification (0.25 micron/pixel). Board-certified gastrointestinal pathologists categorized the WSIs into four activity classes: inactive, mildly active, moderately active, and severely active. A transformer-based model was developed and validated using five-fold cross-validation to classify IBD activity. Using HoVerNet, we examined neutrophil distribution across activity grades. Attention maps from our model highlighted areas contributing to its prediction. The model classified IBD activity with weighted averages of 0.871 [95% Confidence Interval (CI): 0.860-0.883] for the area under the curve, 0.695 [95% CI: 0.674-0.715] for precision, 0.697 [95% CI: 0.678-0.716] for recall, and 0.695 [95% CI: 0.674-0.714] for F1-score. Neutrophil distribution was significantly different across activity classes. Qualitative evaluation of attention maps by a gastrointestinal pathologist suggested their potential for improved interpretability. Our model demonstrates robust diagnostic performance and could enhance consistency and efficiency in IBD activity assessment.
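The weighted-average precision, recall, and F1-score reported above can be illustrated with a minimal sketch. It assumes "weighted" means each class's metric is weighted by its share of true labels, and it uses a percentile bootstrap for the 95% CI; the abstract does not state how the intervals were obtained, so the bootstrap is an assumption, not the authors' method. The class names, function names, and toy labels are hypothetical.

```python
import random
from collections import Counter

# Hypothetical short names for the four activity classes in the abstract.
CLASSES = ["inactive", "mild", "moderate", "severe"]


def weighted_prf(y_true, y_pred, classes=CLASSES):
    """Return support-weighted (precision, recall, F1) for multi-class labels."""
    n = len(y_true)
    support = Counter(y_true)
    predicted = Counter(y_pred)
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    p_sum = r_sum = f_sum = 0.0
    for c in classes:
        prec = true_pos[c] / predicted[c] if predicted[c] else 0.0
        rec = true_pos[c] / support[c] if support[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        weight = support[c] / n  # class share among the true labels
        p_sum += weight * prec
        r_sum += weight * rec
        f_sum += weight * f1
    return p_sum, r_sum, f_sum


def bootstrap_ci(y_true, y_pred, metric_index=2, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method) for one of the three metrics."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample slides with replacement and recompute the metric.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_true = [y_true[i] for i in idx]
        sample_pred = [y_pred[i] for i in idx]
        stats.append(weighted_prf(sample_true, sample_pred)[metric_index])
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy labels for illustration only; these are not data from the study.
toy_true = ["inactive", "inactive", "mild", "moderate", "severe", "severe"]
toy_pred = ["inactive", "mild", "mild", "moderate", "severe", "moderate"]
precision, recall, f1 = weighted_prf(toy_true, toy_pred)
f1_low, f1_high = bootstrap_ci(toy_true, toy_pred, metric_index=2, n_boot=200)
```

With support weighting, the weighted recall equals overall accuracy, which is one reason the reported precision, recall, and F1 values can sit so close together.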

