Learning from Literature: Integrating LLMs and Bayesian Hierarchical Modeling for Oncology Trial Design

Designing modern oncology trials requires synthesizing evidence from prior studies to inform hypothesis generation and sample size determination. Trial designs based on incomplete or imprecise summaries can lead to misspecified hypotheses and underpowered studies, resulting in false-positive or false-negative conclusions. To address this challenge, we developed LEAD-ONC (Literature to Evidence for Analytics and Design in Oncology), an AI-assisted framework that transforms published clinical trial reports into quantitative, design-relevant evidence. Given expert-curated trial publications that meet prespecified eligibility criteria, LEAD-ONC uses large language models to extract baseline characteristics and reconstruct individual patient data from Kaplan-Meier curves, then applies Bayesian hierarchical modeling to generate predictive survival distributions for a prespecified target trial population. We demonstrate the framework using five phase III trials in first-line non-small-cell lung cancer evaluating PD-1 or PD-L1 inhibitors with or without CTLA-4 blockade. Clustering based on baseline characteristics identified three clinically interpretable populations defined by histology. For a prospective randomized trial in the mixed-histology population comparing mono versus dual immune checkpoint inhibition, LEAD-ONC projected a modest median overall survival difference of 2.8 months (95% credible interval, -2.0 to 7.6) and an estimated probability of at least a 3-month benefit of approximately 0.45. As LEAD-ONC remains under active development, these results are intended as preliminary demonstrations of the framework's potential to support evidence-driven oncology trial design rather than as definitive clinical conclusions.
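
As a rough consistency check on the reported summary, the probability of at least a 3-month benefit can be approximated from the point estimate and credible interval alone. The sketch below assumes the predictive distribution of the median overall survival difference is approximately normal, which the abstract does not state; the function name `prob_benefit_at_least` and its parameters are hypothetical and are not part of LEAD-ONC.

```python
from statistics import NormalDist


def prob_benefit_at_least(point, lower, upper, threshold, level=0.95):
    """Approximate P(difference >= threshold) from a point estimate and a
    central credible interval.

    ASSUMPTION: the distribution of the difference is normal, centered at
    `point`, with a standard deviation implied by the interval width.
    """
    # Standard normal quantile for the central interval (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Back out the standard deviation from the interval width.
    sd = (upper - lower) / (2 * z)
    # Upper-tail probability beyond the benefit threshold.
    return 1.0 - NormalDist(mu=point, sigma=sd).cdf(threshold)


# Values reported in the abstract: 2.8 months, 95% interval -2.0 to 7.6.
p = prob_benefit_at_least(point=2.8, lower=-2.0, upper=7.6, threshold=3.0)
print(round(p, 2))
```

Under this approximation the implied standard deviation is about 2.45 months and the probability comes out near 0.47, in line with the reported value of approximately 0.45. The remaining gap could reflect a non-normal predictive distribution, but that cannot be determined from the summary statistics alone.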
