A comprehensive multimodal dataset and benchmark for ulcerative colitis scoring in endoscopy

Noha Ghatwary, Jiangbei Yue, Ahmed Elgendy, Hanna Nagdy, Ahmed Galal, Hayam Fathy, Hussein El-Amin, Venkataraman Subramanian, Noor Mohammed, Gilberto Ochoa-Ruiz, Sharib Ali

from arxiv, 11

Ulcerative colitis (UC) is a chronic mucosal inflammatory condition that places patients at increased risk of colorectal cancer. Colonoscopic surveillance remains the gold standard for assessing disease activity, and reporting typically relies on standardised endoscopic scoring metrics. The most widely used is the Mayo Endoscopic Score (MES), with some centres also adopting the Ulcerative Colitis Endoscopic Index of Severity (UCEIS). Both are descriptive assessments of mucosal inflammation (MES: 0 to 3; UCEIS: 0 to 8), where higher values indicate more severe disease. However, computational methods for automatically predicting these scores remain limited, largely due to the lack of publicly available expert-annotated datasets and the absence of robust benchmarking. There is also a significant research gap in generating clinically meaningful descriptions of UC images, despite image captioning being a well-established computer vision task. Variability in endoscopic systems and procedural workflows across centres further highlights the need for multi-centre datasets to ensure algorithmic robustness and generalisability. In this work, we introduce a curated multi-centre, multi-resolution dataset that includes expert-validated MES and UCEIS labels, alongside detailed clinical descriptions. To our knowledge, this is the first comprehensive dataset that combines dual scoring metrics for classification tasks with expert-generated captions describing mucosal appearance and clinically accepted reasoning for image captioning. This resource opens new opportunities for developing clinically meaningful multimodal algorithms. In addition to the dataset, we also provide benchmarking using convolutional neural networks, vision transformers, hybrid models, and widely used multimodal vision-language captioning algorithms.
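As a rough illustration of the kind of record such a dataset implies, the sketch below pairs an image with its two scores and a caption, and checks the score ranges stated above (MES: 0 to 3; UCEIS: 0 to 8). The class name `UCSample` and its field names are hypothetical, chosen only for this example; the actual file layout and schema of the dataset are not described here.

```python
from dataclasses import dataclass

# Score ranges as stated in the abstract.
MES_RANGE = (0, 3)
UCEIS_RANGE = (0, 8)


@dataclass(frozen=True)
class UCSample:
    """Hypothetical record: one endoscopic image with dual scores and a caption."""

    image_path: str  # assumed field name, not taken from the dataset
    mes: int         # Mayo Endoscopic Score, 0 to 3
    uceis: int       # UCEIS, 0 to 8
    caption: str     # expert-written description of mucosal appearance

    def __post_init__(self) -> None:
        # Reject labels that fall outside the documented score ranges.
        checks = (
            ("mes", self.mes, MES_RANGE),
            ("uceis", self.uceis, UCEIS_RANGE),
        )
        for name, value, (low, high) in checks:
            if not isinstance(value, int) or not low <= value <= high:
                raise ValueError(
                    f"{name} must be an integer in [{low}, {high}], got {value!r}"
                )
```

A loader built on a structure like this would fail early on mislabelled entries rather than passing out-of-range scores to a classifier.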

