eBay's experimentation platform runs hundreds of A/B tests on any given day. The platform integrates with the tracking infrastructure and customer experience servers, provides the sampling service for experiments, and is responsible for monitoring the progress of each A/B test. Ensuring experiment quality at this scale poses many challenges. We discuss two automated processes and methodologies for monitoring test quality: randomization validation using the population stability index (PSI), and detection of sample ratio mismatch (also known as sample delta) using sequential analysis. These automated processes help the experimentation platform run high-quality, trustworthy tests effectively at large scale, and efficiently, by minimizing the false-positive monitoring alarms sent to experimenters.
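To make the first of the two checks concrete, the following is a minimal sketch of the standard PSI formula, $\mathrm{PSI} = \sum_i (a_i - e_i)\ln(a_i / e_i)$, where $e_i$ and $a_i$ are the shares of bin $i$ in two populations. Treating the control group as the "expected" population and the treatment group as the "actual" one, and binning users by some pre-experiment covariate, are assumptions made here for illustration; the function name, the `eps` floor for empty bins, and the example counts are hypothetical and are not taken from eBay's platform.

```python
import math


def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """Compute PSI between two binned populations.

    expected_counts / actual_counts: per-bin user counts over the same bins
    (assumed here to be control and treatment, respectively).
    eps: hypothetical floor on bin shares so empty bins do not yield log(0).
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both populations must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total <= 0 or actual_total <= 0:
        raise ValueError("each population needs at least one observation")

    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Convert counts to shares, floored at eps.
        expected_share = max(expected / expected_total, eps)
        actual_share = max(actual / actual_total, eps)
        # Each term is non-negative: the difference and the log share a sign.
        psi += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return psi


# Illustrative (made-up) counts of users per covariate bin.
control = [1000, 2000, 3000]
treatment = [1010, 1990, 3000]
print(population_stability_index(control, treatment))
```

A PSI near zero indicates that the two groups have nearly the same distribution over the bins. A widely quoted rule of thumb treats values below 0.1 as stable and values above 0.25 as a substantial shift; whether the platform described here uses those thresholds is not stated in the abstract.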