We present Clinical Prediction with Large Language Models (CPLLM), a method that fine-tunes a pre-trained Large Language Model (LLM) for clinical disease prediction. We apply quantization and fine-tune the LLM with prompts on the task of predicting, from a patient's historical diagnosis records, whether the patient will be diagnosed with a target disease at the next visit or in the subsequent diagnosis. We compare CPLLM against several baselines, including Logistic Regression, RETAIN, and Med-BERT, the current state-of-the-art model for disease prediction from structured electronic health record (EHR) data. In our experiments, CPLLM surpasses all tested models on both PR-AUC and ROC-AUC, with notable improvements over the baselines.
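To make the task setup and the two reported metrics concrete, the following is a minimal sketch and not the CPLLM implementation. The prompt wording, the function names (`build_prompt`, `roc_auc`, `average_precision`), and the toy inputs are all assumptions introduced for illustration. PR-AUC is approximated here by average precision, one common estimator, and tied scores are not handled specially.

```python
from typing import Sequence


def build_prompt(history: Sequence[str], target_disease: str) -> str:
    """Format a diagnosis history as a yes/no prediction prompt.

    The wording is hypothetical; it is not the template used by CPLLM.
    """
    diagnoses = "\n".join(f"- {d}" for d in history)
    return (
        f"Patient diagnosis history:\n{diagnoses}\n"
        f"Will the patient be diagnosed with {target_disease} "
        "at the next visit? Answer yes or no."
    )


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which
    the positive example scores higher; a tie counts as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: the mean of precision@k over the ranks k at
    which a positive example appears, ranking by descending score.

    Assumes scores are distinct; ties are broken by sort order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits
```

For the toy labels `[0, 0, 1, 1]` with scores `[0.1, 0.4, 0.35, 0.8]`, `roc_auc` returns 0.75 and `average_precision` returns 5/6. PR-AUC is reported alongside ROC-AUC because it is more sensitive to performance on the positive class when the target disease is rare.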