AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow

Huizi Yu, Jiayan Zhou, Lingyao Li, Shan Chen, Jack Gallifant, Anye Shi, Xiang Li, Wenyue Hua, Mingyu Jin, Guang Chen, Yang Zhou, Zhao Li, Trisha Gupte, Ming-Li Chen, Zahra Azizi, Yongfeng Zhang, Themistocles L. Assimes, Xin Ma, Danielle S. Bitterman, Lin Lu, Lizhou Fan

From arXiv, 42 pages, 6 figures, 7 tables

Simulated patient systems play a crucial role in modern medical education and research, providing safe, integrative learning environments and enabling clinical decision-making simulations. Large Language Models (LLMs) could advance simulated patient systems by replicating medical conditions and patient-doctor interactions with high fidelity and low cost. However, ensuring the effectiveness and trustworthiness of these systems remains a challenge, as they require a large, diverse, and precise patient knowledgebase, along with robust and stable knowledge diffusion to users. Here, we developed AIPatient, an advanced simulated patient system with the AIPatient Knowledge Graph (AIPatient KG) as the input and the Reasoning Retrieval-Augmented Generation (Reasoning RAG) agentic workflow as the generation backbone. AIPatient KG samples data from Electronic Health Records (EHRs) in the Medical Information Mart for Intensive Care (MIMIC)-III database, producing a clinically diverse and relevant cohort of 1,495 patients with high knowledgebase validity (F1 0.89). Reasoning RAG leverages six LLM-powered agents spanning tasks including retrieval, KG query generation, abstraction, checking, rewriting, and summarization. This agentic framework reaches an overall accuracy of 94.15% in EHR-based medical Question Answering (QA), outperforming benchmarks that use either no agent or only partial agent integration. Our system also shows high readability (median Flesch Reading Ease 77.23; median Flesch-Kincaid Grade 5.6), robustness (ANOVA F-value 0.6126, p>0.1), and stability (ANOVA F-value 0.782, p>0.1). The promising performance of the AIPatient system highlights its potential to support a wide range of applications, including medical education, model evaluation, and system integration.
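
The two readability scores quoted in the abstract follow standard published formulas, and the sketch below shows how they can be computed for a piece of text. It is an illustration only: the abstract does not say which tool the authors used, and the helper names `count_syllables` and `readability`, along with the vowel-group syllable heuristic, are assumptions made here. Dedicated libraries count syllables more carefully, so scores from this sketch will not exactly match theirs.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the paper's method):
    count vowel groups and drop a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard Flesch Reading Ease formula: higher means easier to read.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Standard Flesch-Kincaid Grade Level formula: approximate US school grade.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkgl


if __name__ == "__main__":
    fre, fkgl = readability("The cat sat. The dog ran.")
    print(f"Flesch Reading Ease: {fre:.2f}, Flesch-Kincaid Grade: {fkgl:.2f}")
```

On the conventional interpretation scale, a Reading Ease score in the 70s and a grade level between 5 and 6 correspond to text that is fairly easy for a general audience to read.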
