Many modern tech companies, such as Google, Uber, and Didi, use online experiments (also known as A/B testing) to evaluate new policies against existing ones. While most studies concentrate on average treatment effects, settings with skewed and heavy-tailed outcome distributions may benefit from alternative criteria, such as quantiles. However, assessing dynamic quantile treatment effects (QTE) remains a challenge, particularly for data from ride-sourcing platforms that involve sequential decision-making across time and space. In this paper, we establish a formal framework to calculate QTE conditional on characteristics independent of the treatment. Under specific model assumptions, we demonstrate that the dynamic conditional QTE (CQTE) equals the sum of individual CQTEs across time, even though the conditional quantile of cumulative rewards does not necessarily equal the sum of conditional quantiles of individual rewards. This insight substantially simplifies estimation and inference for our target causal estimand. We then introduce two varying coefficient decision process (VCDP) models and devise a method to test the dynamic CQTE. Moreover, we extend our approach to accommodate data from spatiotemporal dependent experiments and examine both conditional quantile direct and indirect effects. To showcase the practical utility of our method, we apply it to three real-world datasets from a ride-sourcing platform. Theoretical findings and comprehensive simulation studies further support our proposal.
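The distinction between the quantile of cumulative rewards and the sum of per-period quantiles can be illustrated numerically. The sketch below is not the paper's VCDP model or its test procedure. It assumes a hypothetical two-period setting with independent exponential rewards and a constant additive treatment shift in each period (the function name, the shift values, and the reward distribution are all illustrative assumptions). In this simplified case the quantile of the sum differs from the sum of quantiles, yet the QTE on the cumulative reward still matches the sum of the per-period QTEs.

```python
import numpy as np

def quantile_additivity_demo(n=200_000, tau=0.5, seed=0):
    """Toy illustration only; the setup is assumed, not taken from the paper."""
    rng = np.random.default_rng(seed)

    # Hypothetical skewed rewards under control in two periods.
    r1 = rng.exponential(1.0, n)
    r2 = rng.exponential(1.0, n)

    # Quantile of the cumulative reward vs. sum of per-period quantiles.
    q_of_sum = np.quantile(r1 + r2, tau)
    sum_of_q = np.quantile(r1, tau) + np.quantile(r2, tau)

    # Assumed treatment: a constant additive shift in each period.
    shift1, shift2 = 0.3, 0.5
    t1, t2 = r1 + shift1, r2 + shift2

    # QTE on the cumulative reward.
    qte_cumulative = np.quantile(t1 + t2, tau) - q_of_sum

    # Sum of per-period QTEs.
    qte_sum = (np.quantile(t1, tau) - np.quantile(r1, tau)) + (
        np.quantile(t2, tau) - np.quantile(r2, tau)
    )
    return q_of_sum, sum_of_q, qte_cumulative, qte_sum

if __name__ == "__main__":
    q_of_sum, sum_of_q, qte_cumulative, qte_sum = quantile_additivity_demo()
    print("quantile of sum:", q_of_sum)
    print("sum of quantiles:", sum_of_q)
    print("QTE of cumulative reward:", qte_cumulative)
    print("sum of per-period QTEs:", qte_sum)
```

The constant-shift assumption is what makes the two QTE quantities agree here. The paper establishes additivity under its own model assumptions, which this toy example does not attempt to reproduce.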