Generalist pathology foundation models (PFMs), pretrained on large-scale multi-organ datasets, have demonstrated remarkable predictive capabilities across diverse clinical applications. However, their proficiency across the full spectrum of clinically essential tasks within a specific organ system remains an open question, for two reasons: large-scale validation cohorts for a single organ are lacking, and there is no tailored training paradigm that can effectively translate broad histomorphological knowledge into the organ-specific expertise required for specialist-level interpretation. In this study, we propose BRIGHT, the first PFM specifically designed for breast pathology, trained on approximately 210 million histopathology tiles from over 51,000 breast whole-slide images (WSIs) derived from a cohort of over 40,000 patients across 19 hospitals. BRIGHT employs a collaborative generalist-specialist framework to capture both universal and organ-specific features. To comprehensively evaluate the performance of PFMs in breast oncology, we curate the largest multi-institutional cohorts to date for downstream task development and evaluation, comprising over 25,000 WSIs across 10 hospitals. The validation cohorts cover the full spectrum of breast pathology across 24 distinct clinical tasks spanning diagnosis, biomarker prediction, treatment response prediction, and survival prediction. Extensive experiments demonstrate that BRIGHT outperforms three leading generalist PFMs, achieving state-of-the-art (SOTA) performance in 21 of 24 internal validation tasks and in 5 of 10 external validation tasks, while producing heatmaps with excellent interpretability. By evaluating on large-scale validation cohorts, this study not only demonstrates BRIGHT's clinical utility in breast oncology but also validates a collaborative generalist-specialist paradigm, providing a scalable template for developing PFMs for a specific organ system.