A/B testing is a common approach used in industry to facilitate innovation through the introduction of new features or the modification of existing software. Traditionally, A/B tests are conducted sequentially, with each experiment targeting the entire population of the corresponding application. This approach can be time-consuming and costly, particularly when an experiment is not relevant to the entire population. To tackle these problems, we introduce AutoPABS (Automated Pipelines of A/B tests using Self-adaptation), a new self-adaptive approach that (1) automates the execution of pipelines of A/B tests, and (2) supports a population split in the pipeline that uses machine learning to divide the population across multiple A/B tests according to user-based criteria. We first conducted a small survey to probe how participants appraise the notation and infrastructure of AutoPABS. We then performed a series of tests, using an extension of the SEAByTE artifact, to measure the gains obtained by applying a population split in an automated A/B testing pipeline. The survey results show that participants consider both the automation of A/B testing pipelines and the population split useful. The tests show that automatically executing pipelines of A/B tests with a population split, where experiments run in parallel, yields statistically significant results faster than a traditional approach that performs the experiments sequentially.