Foundation Models to Unlock Real-World Evidence from Nationwide Medical Claims

Fan Ma, Yuntian Liu, Xiang Lan, Weipeng Zhou, Jun Ni, Mauro Giuffrè, Lingfei Qian, Xueqing Peng, Yujia Zhou, Ruey-Ling Weng, Huan He, Lu Li, Qingyu Chen, Andrew Loza, Laila Rasmy, Degui Zhi, Yuan Lu, Chenjie Zeng, Joshua C Denny, Lee Schwamm, Daniella Meeker, Lucila Ohno-Machado, Yong Chen, Hua Xu

Evidence derived from large-scale real-world data (RWD) is increasingly informing regulatory evaluation and healthcare decision-making. Administrative claims provide population-scale, longitudinal records of healthcare utilization, expenditure, and detailed coding of diagnoses, procedures, and medications, yet their potential as a substrate for healthcare foundation models remains largely unexplored. Here we present ReClaim, a generative transformer trained from scratch on 43.8 billion medical events from more than 200 million enrollees in the MarketScan claims data spanning 2008-2022. ReClaim models longitudinal trajectories across diagnoses, procedures, medications, and expenditure, and was scaled to 140 million, 700 million, and 1.7 billion parameters. Across over 1,000 disease-onset prediction tasks, ReClaim achieved a mean AUC of 75.6%, substantially outperforming disease-specific LightGBM (66.3%) and the transformer-based Delphi model (69.4%), with the largest gains for rare diseases. These advantages held across retrospective and prospective evaluations and in external validation on two independent datasets. Performance improved monotonically with scale, and post-training added 13.8 percentage points over pre-training alone. Beyond disease prediction, ReClaim captured financial outcomes and improved real-world evidence (RWE) analyses: for healthcare expenditure forecasting it increased explained variance from 0.28 to 0.37 relative to LightGBM, and in a target trial emulation it reduced systematic bias by 72% on average relative to Delphi. Together, these results establish administrative claims as a scalable substrate for healthcare foundation models and show that learned representations generalize across time periods and data sources, supporting disease surveillance, expenditure forecasting, and RWE generation.
