Large Language Models (LLMs) have demonstrated significant potential in medicine, with many studies adapting them through continued pre-training or fine-tuning on medical data to enhance domain-specific accuracy and safety. However, a key open question remains: to what extent do LLMs memorize medical training data? Memorization can be beneficial when it enables LLMs to retain valuable medical knowledge during domain adaptation. Yet it also raises concerns: LLMs may inadvertently reproduce sensitive clinical content (e.g., patient-specific details), and excessive memorization may reduce model generalizability, increasing the risks of misdiagnosis and unwarranted recommendations. These risks are further amplified by the generative nature of LLMs, which can not only surface memorized content but also produce overconfident, misleading outputs that may hinder clinical adoption. In this work, we present a study of memorization in LLMs adapted to medicine, assessing its prevalence (how frequently it occurs), characteristics (what is memorized), volume (how much content is memorized), and potential downstream impacts (how memorization may affect medical applications). We systematically analyze three common adaptation scenarios: (1) continued pre-training on medical corpora, (2) fine-tuning on standard medical benchmarks, and (3) fine-tuning on real-world clinical data, including over 13,000 unique inpatient records from the Yale New Haven Health System. The results demonstrate that memorization is prevalent across all adaptation scenarios and significantly higher than that reported in the general domain. Moreover, memorization exhibits distinct characteristics during continued pre-training and fine-tuning, and it is persistent: up to 87% of content memorized during continued pre-training remains after fine-tuning on new medical tasks.