A/B testing, or controlled experimentation, is the gold-standard approach for causally comparing the performance of algorithms on online platforms. However, the conventional Bernoulli randomization used in A/B testing faces well-known challenges such as spillover and carryover effects. Our study focuses on another challenge that is especially relevant to A/B testing on two-sided platforms: budget constraints. Buyers on two-sided platforms often have limited budgets, which can make conventional A/B testing infeasible, partly because two variants of an allocation algorithm may conflict and cause some buyers to exceed their budgets when both are run simultaneously. We develop a model of two-sided platforms in which buyers have limited budgets. We then provide an optimal experimental design that guarantees small bias and minimum variance. The bias is lower when budgets are larger and the supply-demand rate is higher. We test our experimental design on both synthetic and real-world data; the results verify the theoretical findings and show the design's advantage over Bernoulli randomization.