Developing clinical prediction models (e.g., mortality prediction) based on electronic health records (EHRs) typically relies on expert opinion for feature selection and for adjusting the observation window size. This burdens experts and creates a bottleneck in the development process. We propose the Retrieval-Enhanced Medical prediction model (REMed) to address these challenges. REMed can evaluate an essentially unlimited number of clinical events, select the relevant ones, and make predictions. This approach effectively eliminates the need for manual feature selection and enables an unrestricted observation window. We verified these properties through experiments on 27 clinical tasks and two independent cohorts from publicly available EHR datasets, where REMed outperformed other contemporary architectures that aim to handle as many events as possible. Notably, we found that the preferences of REMed align closely with those of medical experts. We expect our approach to significantly expedite the development of EHR prediction models by minimizing the manual involvement required of clinicians.
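The evaluate, select, and predict flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear scorer, the top-k selection rule, the mean pooling, the logistic output, and every name below (`score_events`, `select_top_k`, `predict`, `w_score`, `w_pred`, `k`) are assumptions made only for illustration, and events are assumed to be already encoded as fixed-length vectors.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def score_events(events: List[Vector], w_score: Vector) -> List[float]:
    """Hypothetical retriever: score every event independently.

    Each event is scored on its own, so the history can be arbitrarily
    long. A linear scorer is an assumption made for this sketch.
    """
    return [dot(event, w_score) for event in events]


def select_top_k(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring events, in original order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def predict(
    events: List[Vector], w_score: Vector, w_pred: Vector, k: int
) -> Tuple[float, List[int]]:
    """Score all events, keep the top k, and predict from the kept events only."""
    if not events or k < 1:
        raise ValueError("need at least one event and k >= 1")
    kept = select_top_k(score_events(events, w_score), k)
    dim = len(w_pred)
    # Mean-pool the selected events (an assumption; any sequence model could go here).
    pooled = [sum(events[i][d] for i in kept) / len(kept) for d in range(dim)]
    # Logistic output, e.g. a mortality probability.
    probability = 1.0 / (1.0 + math.exp(-dot(pooled, w_pred)))
    return probability, kept


# Toy usage: four 2-dimensional event vectors, keep the two most relevant.
toy_events = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2], [1.5, 0.5]]
probability, kept_indices = predict(toy_events, [1.0, 0.0], [1.0, -1.0], k=2)
```

Because the scorer looks at one event at a time, no observation window has to be fixed in advance, and the returned indices show which events the model relied on, which is what makes a comparison with expert preferences possible.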