Machine learning (ML) has recently shown promising results in medical prediction using electronic health records (EHRs). However, because ML models can typically handle only a limited input size, specific medical events must be selected from EHRs for use as input. This selection process, which often relies on expert opinion, can become a bottleneck in development. We propose the Retrieval-Enhanced Medical prediction model (REMed) to address these challenges. REMed can evaluate an essentially unlimited number of medical events, select the relevant ones, and make predictions. This allows for an unrestricted input size and eliminates the need for manual event selection. We verified these properties through experiments involving 27 clinical prediction tasks across four independent cohorts, where REMed outperformed the baselines. Notably, we found that the preferences of REMed align closely with those of medical experts. We expect our approach to significantly expedite the development of EHR prediction models by minimizing the need for clinicians' manual involvement.