Many diagnostic errors occur because clinicians cannot easily access relevant information in patient Electronic Health Records (EHRs). In this work, we propose a method that uses LLMs to identify pieces of evidence in patient EHR data that indicate increased or decreased risk of specific diagnoses; our ultimate aim is to increase access to evidence and reduce diagnostic errors. In particular, we propose a Neural Additive Model that makes evidence-backed predictions with individualized risk estimates at time points where clinicians are still uncertain, specifically aiming to mitigate delays in diagnosis and errors stemming from an incomplete differential. Training such a model requires inferring temporally fine-grained retrospective labels of eventual "true" diagnoses. We infer these labels with LLMs, which lets us ensure that the input text predates the point at which a confident diagnosis can be made. We use an LLM to retrieve an initial pool of evidence, then refine this set according to correlations learned by the model. We conduct an in-depth evaluation of the usefulness of our approach by simulating how a clinician might use it to choose among a pre-defined list of differential diagnoses.
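To make the additive structure concrete, the following is a minimal sketch of how a Neural Additive Model can turn per-evidence contributions into an individualized risk estimate. It is an illustration under stated assumptions, not the implementation used in this work: the evidence names, the weights, the bias, and the helper names `make_shape_fn` and `predict_risk` are all hypothetical, and the tiny one-unit networks stand in for learned shape functions.

```python
import math
from typing import Callable, Dict, Tuple


def sigmoid(z: float) -> float:
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def make_shape_fn(w1: float, b1: float, w2: float) -> Callable[[float], float]:
    """Build a tiny one-hidden-unit network: x -> w2 * tanh(w1 * x + b1).

    In a trained model these parameters would be learned; here they are
    fixed, hypothetical values chosen only for illustration.
    """
    def shape_fn(x: float) -> float:
        return w2 * math.tanh(w1 * x + b1)
    return shape_fn


# Hypothetical evidence items, one shape function per item.
# A positive output weight raises risk; a negative one lowers it.
SHAPE_FNS: Dict[str, Callable[[float], float]] = {
    "dyspnea_mentioned": make_shape_fn(2.0, 0.0, 1.5),
    "normal_chest_xray": make_shape_fn(2.0, 0.0, -1.0),
    "elevated_bnp": make_shape_fn(2.0, 0.0, 2.0),
}
BIAS = -1.0  # hypothetical baseline logit for the diagnosis


def predict_risk(evidence: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return (risk, per-evidence contributions) for one diagnosis.

    `evidence` maps an evidence name to a score in [0, 1]; missing items
    default to 0.0 and therefore contribute nothing to the logit.
    """
    contributions = {
        name: fn(evidence.get(name, 0.0)) for name, fn in SHAPE_FNS.items()
    }
    logit = BIAS + sum(contributions.values())
    return sigmoid(logit), contributions


if __name__ == "__main__":
    risk, parts = predict_risk({"elevated_bnp": 1.0, "normal_chest_xray": 1.0})
    print(f"risk = {risk:.3f}")
    for name, value in parts.items():
        print(f"  {name}: {value:+.3f}")
```

Because the logit is a plain sum, each piece of evidence can be shown to a clinician alongside the amount by which it raised or lowered the predicted risk.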