Deployment and Evaluation of an EHR-integrated, Large Language Model-Powered Tool to Triage Surgical Patients

Jane Wang, Timothy Keyes, April S Liang, Stephen P Ma, Jason Shen, Jerry Liu, Nerissa Ambers, Abby Pandya, Rita Pandya, Jason Hom, Natasha Steele, Jonathan H Chen, Kevin Schulman

From arXiv; 35 pages, 4 figures, 5 tables

Surgical co-management (SCM) is an evidence-based model in which hospitalists jointly manage medically complex perioperative patients alongside surgical teams. Despite its clinical and financial value, SCM is limited by the need to manually identify eligible patients. To determine whether SCM triage can be automated, we conducted a prospective, unblinded study at Stanford Health Care in which an LLM-based, electronic health record (EHR)-integrated triage tool (SCM Navigator) provided SCM recommendations followed by physician review. Using pre-operative documentation, structured data, and clinical criteria for perioperative morbidity, SCM Navigator categorized patients as appropriate, not appropriate, or possibly appropriate for SCM. Faculty indicated their clinical judgment and provided free-text feedback when they disagreed. Sensitivity, specificity, positive predictive value, and negative predictive value were measured using physician determinations as a reference. Free-text reasons were thematically categorized, and manual chart review was conducted on all false-negative cases and 30 randomly selected cases from the largest false-positive category. Since deployment, 6,193 cases have been triaged, of which 1,582 (23%) were recommended for hospitalist consultation. SCM Navigator displayed high sensitivity (0.94, 95% CI 0.91-0.96) and moderate specificity (0.74, 95% CI 0.71-0.77). Post-hoc chart review suggested most discrepancies reflect modifiable gaps in clinical criteria, institutional workflow, or physician practice variability rather than LLM misclassification, which accounted for 2 of 19 (11%) false-negative cases. These findings demonstrate that an LLM-powered, EHR-integrated, human-in-the-loop AI system can accurately and safely triage surgical patients for SCM, and that AI-enabled screening tools can augment and potentially automate time-intensive clinical workflows.
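To make the reported metrics concrete, the following minimal Python sketch shows how sensitivity, specificity, positive predictive value, and negative predictive value follow from a 2x2 table of tool recommendations against physician determinations. It rests on several assumptions: the counts in the usage example are hypothetical placeholders, not study data; the Wilson score interval is used only because the abstract does not state how its confidence intervals were computed; the names `wilson_interval` and `triage_metrics` are illustrative; and the inputs are taken to be already binary, since the abstract does not say how the "possibly appropriate" category was handled in the analysis.

```python
import math
from typing import Dict, NamedTuple


class Metric(NamedTuple):
    estimate: float
    ci_low: float
    ci_high: float


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Metric:
    """Return a proportion with its Wilson score interval.

    z = 1.96 gives an approximately 95% interval. The CI method is an
    assumption; the abstract does not say which one the authors used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return Metric(p, centre - half, centre + half)


def triage_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Metric]:
    """Compute the four metrics, treating physician judgment as the reference.

    tp: tool recommended SCM, physician agreed
    fp: tool recommended SCM, physician disagreed
    fn: tool did not recommend SCM, physician judged SCM appropriate
    tn: tool did not recommend SCM, physician agreed
    """
    return {
        "sensitivity": wilson_interval(tp, tp + fn),
        "specificity": wilson_interval(tn, tn + fp),
        "ppv": wilson_interval(tp, tp + fp),
        "npv": wilson_interval(tn, tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; NOT the study's data.
    for name, m in triage_metrics(tp=90, fp=30, fn=10, tn=70).items():
        print(f"{name}: {m.estimate:.2f} (95% CI {m.ci_low:.2f}-{m.ci_high:.2f})")
```

Because sensitivity and specificity are conditioned on the physician's determination while PPV and NPV are conditioned on the tool's recommendation, the latter two depend on how common SCM-appropriate patients are in the triaged population.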
