Electronic health records (EHRs) contain valuable patient data for health-related prediction tasks, such as disease prediction. Traditional approaches rely on supervised learning methods that require large labeled datasets, which can be expensive and difficult to obtain. In this study, we investigate the feasibility of using Large Language Models (LLMs) to convert structured patient visit data (e.g., diagnoses, labs, prescriptions) into natural language narratives. We evaluate the zero-shot and few-shot performance of LLMs using several EHR-prediction-oriented prompting strategies. Furthermore, we propose a novel approach that uses LLM agents with two distinct roles: a predictor agent that makes predictions and generates the reasoning behind them, and a critic agent that analyzes incorrect predictions and provides guidance for improving the predictor agent's reasoning. Our results demonstrate that, with the proposed approach, LLMs can achieve decent few-shot performance relative to traditional supervised learning methods in EHR-based disease prediction, suggesting the approach's potential for health-oriented applications.
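The two ideas described above, rendering a structured visit as a narrative and letting a critic agent turn the predictor's mistakes into guidance, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method evaluated in the study: the field names (`diagnoses`, `labs`, `prescriptions`), the prompt wording, the yes/no answer format, the `LLM` callable interface, and the function names `visit_to_narrative` and `refine_guidance` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: any function that maps a prompt string to a text reply.
LLM = Callable[[str], str]
Visit = Dict[str, List[str]]


def visit_to_narrative(visit: Visit) -> str:
    """Render one structured visit as a short natural-language description."""
    parts = []
    if visit.get("diagnoses"):
        parts.append("was diagnosed with " + ", ".join(visit["diagnoses"]))
    if visit.get("labs"):
        parts.append("had lab results: " + ", ".join(visit["labs"]))
    if visit.get("prescriptions"):
        parts.append("was prescribed " + ", ".join(visit["prescriptions"]))
    if not parts:
        return "The patient had a visit with no recorded events."
    return "During this visit, the patient " + "; ".join(parts) + "."


def refine_guidance(
    predictor: LLM,
    critic: LLM,
    examples: List[Tuple[Visit, bool]],
    guidance: str = "",
) -> str:
    """Run the predictor on labeled examples; the critic comments on each error.

    Returns the accumulated guidance text, which could be prepended to
    later prediction prompts. The prompts here are illustrative only.
    """
    for visit, label in examples:
        narrative = visit_to_narrative(visit)
        answer = predictor(
            f"{guidance}\nPatient record: {narrative}\n"
            "Will the patient develop the target disease? "
            "Answer yes or no, then explain your reasoning."
        )
        predicted = answer.strip().lower().startswith("yes")
        if predicted != label:
            # Only incorrect predictions are sent to the critic agent.
            critique = critic(
                f"The prediction was incorrect (true label: {'yes' if label else 'no'}).\n"
                f"Record: {narrative}\nPrediction and reasoning: {answer}\n"
                "Give one short instruction to improve future reasoning."
            )
            guidance = (guidance + "\n" + critique).strip()
    return guidance
```

Passing the two agents in as plain callables keeps the sketch independent of any particular LLM provider; in practice each would wrap a call to a model API.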