Large Language Models (LLMs) hold significant promise for improving clinical decision support and reducing physician burnout by synthesizing complex, longitudinal cancer Electronic Health Records (EHRs). However, their implementation in this critical field faces three primary challenges: the inability to effectively process the extensive length and fragmented nature of patient records for accurate temporal analysis; a heightened risk of clinical hallucination, as conventional grounding techniques such as Retrieval-Augmented Generation (RAG) do not adequately incorporate process-oriented clinical guidelines; and unreliable evaluation metrics that hinder the validation of AI systems in oncology. To address these issues, we propose CliCARE, a framework for Grounding Large Language Models in Clinical Guidelines for Decision Support over Longitudinal Cancer Electronic Health Records. The framework operates by transforming unstructured, longitudinal EHRs into patient-specific Temporal Knowledge Graphs (TKGs) to capture long-range dependencies, and then grounding the decision support process by aligning these real-world patient trajectories with a normative guideline knowledge graph. This approach provides oncologists with evidence-grounded decision support by generating a high-fidelity clinical summary and an actionable recommendation. We validated our framework using large-scale, longitudinal data from a private Chinese cancer dataset and the public English MIMIC-IV dataset. In these settings, CliCARE significantly outperforms baselines, including leading long-context LLMs and Knowledge Graph-enhanced RAG methods. The clinical validity of our results is supported by a robust evaluation protocol, which demonstrates a high correlation with assessments made by oncologists.