Adaptive treatment assignment algorithms, such as bandit and reinforcement learning algorithms, are increasingly used in digital health intervention clinical trials. Causal inference and related data analyses are critical for evaluating digital health interventions, deciding how to refine the intervention, and deciding whether to roll out the intervention more broadly. However, the replicability of these analyses has received relatively little attention. This work investigates the replicability of statistical analyses from trials deploying adaptive treatment assignment algorithms. We demonstrate that many standard statistical estimators can be inconsistent and fail to be replicable across repetitions of the clinical trial, even as the sample size grows large. We show that this non-replicability is intimately related to properties of the adaptive algorithm itself. We introduce a formal definition of a "replicable bandit algorithm" and prove that under such algorithms, a wide variety of common statistical analyses are guaranteed to be consistent. We present both theoretical results and simulation studies based on a mobile health oral health self-care intervention. Our findings underscore the importance of designing adaptive algorithms with replicability in mind, especially for settings like digital health where deployment decisions rely heavily on replicated evidence. We conclude by discussing open questions on the connections between algorithm design, statistical inference, and experimental replicability.