Current gastric cancer (GCa) risk stratification systems are error-prone because they assign risk from a visual estimate of the percentage of intestinal metaplasia in histopathology images of gastric mucosa. This study presents an automated method that detects and quantifies intestinal metaplasia using deep convolutional neural networks, together with a comparative analysis against the visual estimates of three pathologists. Gastric samples were collected from two cohorts: 149 asymptomatic volunteers from a region of Colombia with a high prevalence of GCa, and 56 patients from a tertiary hospital. Deep learning models were trained to classify intestinal metaplasia, and their predictions were used to estimate the percentage of intestinal metaplasia and to assign an adapted OLGIM stage. Atrophy was not assessed because of its limited reproducibility among pathologists. Results were compared with independent, blinded assessments of metaplasia performed by three qualified pathologists. The best-performing deep learning architecture classified intestinal metaplasia with an F1-score of 0.80 ± 0.01 and an AUC of 0.91 ± 0.01. Among pathologists, inter-observer agreement measured by Fleiss' kappa ranged from 0.20 to 0.48. By comparison, agreement between the pathologists and the best-performing model ranged from 0.12 to 0.35. Deep learning models thus show potential to reliably detect intestinal metaplasia and quantify its percentage, achieving high classification performance. In practice, visual estimation is still the only available method, yet it is marked by considerable inter-observer variability. Deep learning models provide consistent estimates that could help reduce this subjectivity in risk stratification.
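
The agreement figures above are reported as Fleiss' kappa. As a minimal sketch of how such a score can be computed, the function below implements the standard Fleiss' kappa formula in plain Python. The function name `fleiss_kappa` and the count-table input layout (one row per case, one column per stage, each cell holding the number of raters who chose that stage) are assumptions made for illustration; the abstract does not describe the authors' actual implementation or how they grouped raters.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned case i to category j (hypothetical input layout).

    Every case must be rated by the same number of raters (at least 2).
    """
    n_cases = len(counts)
    if n_cases == 0:
        raise ValueError("at least one case is required")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    total = n_cases * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per case: fraction of rater pairs that agree.
    pairs = n_raters * (n_raters - 1)
    p_case = [(sum(c * c for c in row) - n_raters) / pairs for row in counts]

    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_expected == 1.0:
        # All ratings are in a single category; kappa is undefined.
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Toy data only: 4 cases, 3 raters, 3 stage categories.
    toy = [[3, 0, 0], [0, 3, 0], [1, 2, 0], [0, 1, 2]]
    print(round(fleiss_kappa(toy), 3))
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance, which is the scale on which the reported ranges of 0.20 to 0.48 and 0.12 to 0.35 should be read.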


