Artificial intelligence for abnormality detection in high volume neuroimaging: a systematic review and meta-analysis

Purpose: Most studies evaluating artificial intelligence (AI) models that detect abnormalities in neuroimaging either test them on unrepresentative patient cohorts or validate them insufficiently, leading to poor generalisability to real-world tasks. The aim was to determine the diagnostic test accuracy of AI models performing first-line, high-volume neuroimaging tasks and to summarise the evidence supporting their use.

Methods: Medline, Embase, Cochrane Library and Web of Science were searched until September 2021 for studies that temporally or externally validated AI capable of detecting abnormalities in first-line CT or MR neuroimaging. A bivariate random-effects model was used for meta-analysis where appropriate. PROSPERO: CRD42021269563.

Results: Only 16 studies were eligible for inclusion. Included studies were not compromised by unrepresentative datasets or inadequate validation methodology. Direct comparison with radiologists was available in 4/16 studies. 15/16 studies had a high risk of bias. Meta-analysis was only suitable for intracranial haemorrhage detection in CT imaging (10/16 studies), where AI systems had a pooled sensitivity of 0.90 (95% CI 0.85-0.94) and a pooled specificity of 0.90 (95% CI 0.83-0.95). The remaining studies, using CT and MRI, detected target conditions other than haemorrhage (2/16) or multiple target conditions (4/16). Only 3/16 studies implemented AI in clinical pathways, either for pre-read triage or as post-read discrepancy identifiers.

Conclusion: The paucity of eligible studies reflects that most abnormality detection AI studies were not adequately validated in representative clinical cohorts. The few studies describing how abnormality detection AI could impact patients and clinicians did not explore the full ramifications of clinical implementation.
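To give a feel for how a pooled sensitivity or specificity with a 95% CI can be obtained, the following is a minimal Python sketch. It is not the review's method: the abstract states a bivariate random-effects model, which pools sensitivity and specificity jointly and accounts for their correlation, whereas this sketch pools one proportion at a time with a univariate DerSimonian-Laird random-effects model on the logit scale. The function name `pool_logit_random_effects`, the 0.5 continuity correction applied to every study, and all counts in the demo are assumptions made for illustration; the counts are hypothetical and are not data from the included studies.

```python
import math


def inv_logit(x):
    """Map a logit value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_logit_random_effects(events, totals):
    """Univariate DerSimonian-Laird pooling of proportions on the logit scale.

    events[i] / totals[i] is the proportion in study i, for example
    true positives / diseased cases for sensitivity.
    Returns (pooled, lower, upper), where the bounds form a 95% CI.
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        # Assumption: add 0.5 to both cells in every study so the logit
        # stays finite when e == 0 or e == n.
        a = e + 0.5
        b = (n - e) + 0.5
        ys.append(math.log(a / b))   # logit of the corrected proportion
        vs.append(1.0 / a + 1.0 / b)  # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in vs]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))

    # Method-of-moments estimate of the between-study variance tau^2.
    k = len(ys)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if (k > 1 and c > 0) else 0.0

    # Random-effects weights include tau^2 in each study's variance.
    w_re = [1.0 / (v + tau2) for v in vs]
    sw_re = sum(w_re)
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sw_re
    se = math.sqrt(1.0 / sw_re)

    return (
        inv_logit(y_re),
        inv_logit(y_re - 1.96 * se),
        inv_logit(y_re + 1.96 * se),
    )


if __name__ == "__main__":
    # HYPOTHETICAL per-study counts (tp, fn, tn, fp), for illustration only.
    studies = [(88, 12, 180, 20), (45, 5, 95, 5), (130, 20, 260, 40)]

    sens = pool_logit_random_effects(
        [tp for tp, fn, tn, fp in studies],
        [tp + fn for tp, fn, tn, fp in studies],
    )
    spec = pool_logit_random_effects(
        [tn for tp, fn, tn, fp in studies],
        [tn + fp for tp, fn, tn, fp in studies],
    )
    print("pooled sensitivity: %.3f (95%% CI %.3f-%.3f)" % sens)
    print("pooled specificity: %.3f (95%% CI %.3f-%.3f)" % spec)
```

Pooling on the logit scale keeps the back-transformed confidence limits inside (0, 1), which is why the reported intervals are asymmetric around the point estimate. A full bivariate analysis would normally be run with dedicated meta-analysis software rather than hand-rolled code like this.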

