Vision-language foundation models have shown great promise in computational pathology but remain primarily data-driven, lacking explicit integration of medical knowledge. We introduce KEEP (KnowledgE-Enhanced Pathology), a foundation model that systematically incorporates disease knowledge into pretraining for cancer diagnosis. KEEP leverages a comprehensive disease knowledge graph encompassing 11,454 diseases and 139,143 attributes to reorganize millions of pathology image-text pairs into 143,000 semantically structured groups aligned with disease ontology hierarchies. This knowledge-enhanced pretraining aligns visual and textual representations within hierarchical semantic spaces, enabling deeper understanding of disease relationships and morphological patterns. Across 18 public benchmarks (over 14,000 whole-slide images) and 4 institutional rare cancer datasets (926 cases), KEEP consistently outperformed existing foundation models, showing substantial gains for rare subtypes. These results establish knowledge-enhanced vision-language modeling as a powerful paradigm for advancing computational pathology.