Phased releases are a common strategy in the technology industry for gradually rolling out new products or updates through a sequence of A/B tests in which the number of treated units grows until full deployment or deprecation. Performing phased releases in a principled way requires choosing the proportion of units assigned to the new release so as to balance the risk of an adverse effect against the need to iterate and learn from the experiment rapidly. In this paper, we formalize this problem and propose an algorithm that automatically determines the release percentage at each stage of the schedule, maximizing ramp-up speed while keeping risk under control. Our framework models the problem as a constrained batched bandit problem in which, with high probability, the pre-specified experimental budget is not depleted. The proposed algorithm takes an adaptive Bayesian approach: the maximal number of units assigned to the treatment is determined by the posterior distribution so that the probability of depleting the remaining budget is low. Notably, our approach solves for the ramp sizes analytically by inverting probability bounds, eliminating the need for challenging rare-event Monte Carlo simulation. It requires computing only the means and variances of subsets of the outcomes, making it highly efficient and parallelizable.
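To make the idea of "inverting a probability bound" concrete, the following is a minimal illustrative sketch and not the algorithm proposed in the paper. It assumes independent per-unit losses with a plug-in mean `mu` (positive values mean harm) and variance `sigma2`, and it substitutes Cantelli's one-sided inequality for whatever bound the paper actually uses. It ignores posterior uncertainty in the mean. The names `cantelli_tail` and `max_ramp_size` are hypothetical.

```python
import math


def cantelli_tail(mu, sigma2, budget, n):
    """Cantelli upper bound on P(total loss of n units > budget).

    Assumes n independent units, each with mean loss mu and variance sigma2.
    """
    slack = budget - n * mu          # distance from expected total loss to budget
    if slack <= 0:
        return 1.0                   # bound is uninformative; treat as certain depletion
    var = n * sigma2
    return var / (var + slack * slack)


def max_ramp_size(mu, sigma2, budget, delta, n_max):
    """Largest n <= n_max whose Cantelli bound on budget depletion is <= delta.

    The condition var / (var + slack^2) <= delta is equivalent to
        mu * x^2 + k * x - budget <= 0,  with x = sqrt(n),
    where k = sqrt((1 - delta) / delta * sigma2). This is a quadratic in x
    and can be solved in closed form, so no simulation is needed.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if sigma2 < 0:
        raise ValueError("sigma2 must be non-negative")
    if budget <= 0:
        return 0                     # nothing left to spend
    k = math.sqrt((1.0 - delta) / delta * sigma2)
    disc = k * k + 4.0 * mu * budget
    if disc < 0:
        return n_max                 # mu < 0 and the condition holds for every n
    denom = k + math.sqrt(disc)
    if denom == 0:
        return n_max                 # zero mean and zero variance: no loss at all
    # Numerically stable form of the smallest positive root of the quadratic.
    # For mu < 0 this conservatively ignores the second feasible region.
    x = 2.0 * budget / denom
    return min(n_max, math.floor(x * x))


if __name__ == "__main__":
    n = max_ramp_size(mu=0.01, sigma2=1.0, budget=50.0, delta=0.05, n_max=10_000)
    print(n, cantelli_tail(0.01, 1.0, 50.0, n))
```

The only inputs are a mean, a variance, the remaining budget, and the tolerated depletion probability, which mirrors the abstract's remark that the computation reduces to means and variances. A distribution-free bound such as Cantelli's is conservative, so a sharper bound would permit larger ramps.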