Dementia-R1: Reinforced Pretraining and Reasoning from Unstructured Clinical Notes for Real-World Dementia Prognosis

While Large Language Models (LLMs) have shown strong performance on clinical text understanding, they struggle with longitudinal prediction tasks such as dementia prognosis, which require reasoning over complex, non-monotonic symptom trajectories across multiple visits. Standard supervised training lacks explicit annotations for symptom evolution, while direct Reinforcement Learning (RL) is hindered by sparse binary rewards. To address this challenge, we introduce Dementia-R1, an RL-based framework for longitudinal dementia prognosis from unstructured clinical notes. Our approach adopts a Cold-Start RL strategy that pre-trains the model to predict verifiable clinical indices extracted from patient histories, enhancing the capability to reason about disease progression before determining the final clinical status. Extensive experiments demonstrate that Dementia-R1 achieves an F1 score of 77.03% on real-world unstructured clinical datasets. Notably, on the ADNI benchmark, our 7B model rivals GPT-4o, effectively capturing fluctuating cognitive trajectories. Code is available at https://anonymous.4open.science/r/dementiar1-CDB5
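
To make the contrast between the two reward signals concrete, the following is a minimal sketch, not the authors' implementation. It assumes completions wrap their final answer in `<answer>` tags and that the verifiable clinical index is a single number (a cognitive test score is one possible illustration). The function names, the tag format, and the `tolerance` parameter are all hypothetical. The cold-start stage would score index predictions against values recorded in the patient history, while the final stage would score only the binary clinical status.

```python
import re
from typing import Optional


def parse_answer(completion: str) -> Optional[str]:
    """Return the text inside the last <answer>...</answer> tag, or None.

    The tag format is an assumption made for this sketch.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def index_reward(completion: str, true_score: float, tolerance: float = 0.0) -> float:
    """Hypothetical cold-start reward: 1.0 if the predicted clinical index
    is within `tolerance` of the value recorded in the notes, else 0.0."""
    answer = parse_answer(completion)
    if answer is None:
        return 0.0
    try:
        predicted = float(answer)
    except ValueError:
        # A non-numeric answer cannot be verified, so it earns no reward.
        return 0.0
    return 1.0 if abs(predicted - true_score) <= tolerance else 0.0


def status_reward(completion: str, true_label: str) -> float:
    """Hypothetical final-stage reward: sparse binary match on clinical status."""
    answer = parse_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if answer.lower() == true_label.lower() else 0.0
```

Because each visit in a patient history can supply its own recorded index, a reward like `index_reward` can be evaluated many times per patient, whereas `status_reward` yields a single binary signal per trajectory. This is one way to read the abstract's point about sparse rewards.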
