Adaptive treatment assignment algorithms, such as bandit and reinforcement learning algorithms, are increasingly used in digital health intervention clinical trials. Causal inference and related data analyses are critical for evaluating digital health interventions, deciding how to refine the intervention, and deciding whether to roll out the intervention more broadly. However, the replicability of these analyses has received relatively little attention. This work investigates the replicability of statistical analyses from trials deploying adaptive treatment assignment algorithms. We demonstrate that many standard statistical estimators can be inconsistent and fail to be replicable across repetitions of the clinical trial, even as the sample size grows large. We show that this non-replicability is intimately related to properties of the adaptive algorithm itself. We introduce a formal definition of a "replicable bandit algorithm" and prove that under such algorithms, a wide variety of common statistical analyses are guaranteed to be consistent. We present both theoretical results and simulation studies based on a mobile health oral health self-care intervention. Our findings underscore the importance of designing adaptive algorithms with replicability in mind, especially for settings like digital health where deployment decisions rely heavily on replicated evidence. We conclude by discussing open questions on the connections between algorithm design, statistical inference, and experimental replicability.