Clinical decision-making depends on expert reasoning, which is guided by standardized, evidence-based guidelines. However, translating these guidelines into automated clinical decision support systems risks inaccuracy and, importantly, loss of nuance. We share an application architecture, the Large Language Expert (LLE), that combines the flexibility and power of Large Language Models (LLMs) with the interpretability, explainability, and reliability of Expert Systems. LLMs help address key challenges of Expert Systems, such as integrating and codifying knowledge and normalizing data. Conversely, an Expert System-like approach helps overcome challenges with LLMs, including hallucinations, the difficulty of making atomic and inexpensive updates, and limited testability. To highlight the power of the LLE architecture, we built an LLE to assist with the workup of patients newly diagnosed with cancer. Timely initiation of cancer treatment is critical for optimal patient outcomes. However, the increasing complexity of diagnostic recommendations has made it difficult for primary care physicians to ensure that their patients have completed the necessary workup before their first visit with an oncologist. As with many real-world clinical tasks, these workups require the analysis of unstructured health records and the application of nuanced clinical decision logic. In this study, we describe the design and evaluation of an LLE system built to rapidly identify and suggest the correct diagnostic workup. The system demonstrated a high degree of clinical-level accuracy (>95%) and effectively addressed gaps identified in real-world data from breast and colon cancer patients at a large academic center.