Statistical Design of Pragmatic Trials Using Electronic Health Record Data When Outcome Assessments Are Uncontrolled and Irregular

Pragmatic trials increasingly define outcomes using real-world data such as electronic health records, where assessments are collected during routine care rather than at fixed timepoints. Consequently, these uncontrolled assessments may be irregular, sparse, and affected by the intervention (intervention-dependent assessments), which can bias treatment effect estimates. We developed a simulation study to inform the choice of statistical approach for trials with uncontrolled assessments and applied it to the MI-CARE pragmatic trial. Using a pre-trial cohort that mimicked the trial's eligibility criteria and outcome measurement, we estimated assessment frequency and timing and combined these estimates with assumptions about how the intervention might affect assessment. We simulated sparse and intervention-dependent assessments and compared single-measure approaches with longitudinal models that use all scores. Under intervention-dependent assessments, naive methods, such as using the best score or a randomly selected score without adjusting for measurement timing, produced substantial bias. Models that adjusted flexibly for follow-up timing estimated time-point-specific or time-averaged treatment effects without bias. The simulation results informed the selection of the statistical approach for the MI-CARE trial. Among the unbiased methods, the most powerful was a linear mixed model with an exponential correlation structure, adjustment for time since baseline, and a time-varying intervention effect, used to estimate the intervention effect at the end of the intervention window. Future studies can use pre-trial data to conduct a simulation study tailored to the trial's data features to inform the analytic approach. Trials with uncontrolled assessments should consider the potential for intervention-dependent assessments and select an appropriate method to avoid bias.
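The bias mechanism described above can be illustrated with a toy simulation. The sketch below is not the MI-CARE simulation: every name and number in it is a hypothetical choice made for illustration (one score per patient, a linear decline of 0.5 points per month in both arms, intervention patients assessed later on average, and ordinary least squares standing in for the linear mixed model). It sets the true intervention effect to zero, so any nonzero estimate reflects bias from assessment timing alone.

```python
import random
import statistics


def simulate_trial(n_per_arm=2000, true_effect=0.0, seed=1):
    """Simulate one score per patient with arm-dependent assessment timing."""
    rng = random.Random(seed)
    rows = []  # each row: (arm, months since baseline, score)
    for arm in (0, 1):
        for _ in range(n_per_arm):
            # Assumption: intervention patients tend to be assessed later.
            months = rng.uniform(6, 12) if arm == 1 else rng.uniform(0, 12)
            # Assumption: scores fall 0.5 points per month in both arms.
            score = 20.0 - 0.5 * months + true_effect * arm + rng.gauss(0, 3)
            rows.append((arm, months, score))
    return rows


def naive_difference(rows):
    """Difference in mean scores, ignoring when each score was measured."""
    treated = [y for a, _, y in rows if a == 1]
    control = [y for a, _, y in rows if a == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def time_adjusted_difference(rows):
    """Arm coefficient from least squares of score on arm and months."""
    arm = [a for a, _, _ in rows]
    t = [m for _, m, _ in rows]
    y = [s for _, _, s in rows]
    a_bar, t_bar, y_bar = map(statistics.fmean, (arm, t, y))
    # Centered sums of squares and cross-products.
    s_aa = sum((a - a_bar) ** 2 for a in arm)
    s_tt = sum((m - t_bar) ** 2 for m in t)
    s_at = sum((a - a_bar) * (m - t_bar) for a, m in zip(arm, t))
    s_ay = sum((a - a_bar) * (s - y_bar) for a, s in zip(arm, y))
    s_ty = sum((m - t_bar) * (s - y_bar) for m, s in zip(t, y))
    # Closed-form solution of the two-predictor normal equations.
    return (s_ay * s_tt - s_at * s_ty) / (s_aa * s_tt - s_at ** 2)


rows = simulate_trial()
print(f"naive difference:         {naive_difference(rows):.2f}")
print(f"time-adjusted difference: {time_adjusted_difference(rows):.2f}")
```

Under these assumptions the intervention arm is assessed three months later on average, so the naive difference is expected to sit near -1.5 points even though the true effect is zero, while the time-adjusted estimate should be close to zero. A linear time adjustment is enough here only because the toy trajectory is linear; real data would call for the flexible adjustment and repeated-measures modeling the abstract describes.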
