Magnetic resonance imaging~(MRI) has played a crucial role in brain disease diagnosis, and a range of computer-aided artificial intelligence methods have been proposed for it. However, early explorations usually focus on a limited set of brain diseases within a single study and train models on small-scale data, which creates a generalization bottleneck. Towards a more effective and scalable paradigm, we propose a hierarchical knowledge-enhanced pre-training framework for universal brain MRI diagnosis, termed UniBrain. Specifically, UniBrain leverages a large-scale dataset of 24,770 imaging-report pairs from routine diagnostics. Unlike previous pre-training techniques that learn unimodal visual or textual features, or that rely on brute-force alignment between vision and language information, we exploit the distinct characteristics of report information at different granularities to build a hierarchical alignment mechanism, which makes feature learning more efficient. UniBrain is validated on three real-world datasets with severe class imbalance and on the public BraTS2019 dataset. It not only consistently outperforms all state-of-the-art diagnostic methods by a large margin and provides superior grounding performance, but also performs comparably to expert radiologists on certain disease types.