Software companies have widely used online A/B testing to evaluate the impact of a new technology by offering it to groups of users and comparing it against the unmodified product. However, running an online A/B test requires not only effort in design, implementation, and obtaining stakeholders' approval before it can be served in production, but also several weeks of iterations to collect the data. To address these issues, a recently emerging topic called "Offline A/B Testing" has been attracting increasing attention; it aims to evaluate new technologies offline by estimating their effects from historical logged data. Although this approach is promising due to its lower implementation effort, faster turnaround time, and absence of potential user harm, several limitations must be addressed before its results can effectively inform requirements prioritization in practice, including discrepancies with online A/B test results and the lack of systematic updates as data and parameters vary. In response, in this vision paper, I introduce AutoOffAB, an idea to automatically run variants of offline A/B testing against recent logged data and update the offline evaluation results, which are then used to make decisions on requirements more reliably and systematically.
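To make the core mechanism concrete, the sketch below illustrates how offline A/B testing can estimate a new technology's impact purely from historical logged data, using inverse propensity scoring (IPS), a standard estimator from the off-policy evaluation literature. This is a minimal illustration under my own assumptions, not the paper's prescribed method; all function and variable names are hypothetical.

```python
import numpy as np

def ips_estimate(rewards, logging_propensities, target_propensities):
    """Inverse propensity scoring (IPS) estimate of a candidate policy's
    average reward, computed purely from historical logged data.

    rewards: observed reward for each logged interaction.
    logging_propensities: probability the deployed (logging) policy
        assigned to the action it actually took.
    target_propensities: probability the candidate (new) policy would
        assign to that same action.
    """
    # Importance weights correct for the mismatch between the policy
    # that generated the logs and the policy being evaluated offline.
    weights = target_propensities / logging_propensities
    return float(np.mean(weights * rewards))

# Hypothetical logged data for five user interactions.
rewards = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
p_logged = np.array([0.50, 0.50, 0.25, 0.50, 0.25])
p_candidate = np.array([0.80, 0.20, 0.50, 0.10, 0.50])

print(ips_estimate(rewards, p_logged, p_candidate))  # estimated mean reward
```

In the spirit of AutoOffAB, an estimator like this would be re-run automatically against recent logs as they accumulate, so that the offline evaluation results used for requirements decisions stay current rather than reflecting a one-off snapshot.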