A multimodal and temporal foundation model for virtual patient representations at healthcare system scale

Modern medicine generates vast multimodal data across siloed systems, yet no existing model integrates the full breadth and temporal depth of the clinical record into a unified patient representation. We introduce Apollo, a multimodal temporal foundation model trained and evaluated on more than three decades of longitudinal hospital records from a major US hospital system, comprising 25 billion records from 7.2 million patients and spanning 28 distinct medical modalities and 12 major medical specialties. Apollo learns a unified representation space that integrates more than 100,000 unique medical events in our clinical vocabulary, as well as images and clinical text. This "atlas of medical concepts" forms a computational substrate for modeling entire patient care journeys, each a sequence of structured and unstructured events, which Apollo compresses into virtual patient representations. To assess the potential of these whole-patient representations, we created 322 prognosis and retrieval tasks from a held-out test set of 1.4 million patients. We demonstrate the general clinical forecasting potential of Apollo embeddings, including predicting the risk of new disease onset up to five years in advance (95 tasks), disease progression (78 tasks), treatment response (59 tasks), risk of treatment-related adverse events (17 tasks), and hospital operations endpoints (12 tasks). Using feature attribution techniques, we show that model predictions align with clinically interpretable multimodal biomarkers. We also evaluate semantic similarity search on 61 retrieval tasks and demonstrate the potential of Apollo as a multimodal medical search engine that accepts text and image queries. Together, these modeling capabilities establish the foundation for computable medicine, in which the full context of patient care becomes accessible to computational reasoning.
