Entire Space Cascade Delayed Feedback Modeling for Effective Conversion Rate Prediction

Conversion rate (CVR) prediction is an essential task for large-scale e-commerce platforms. However, refunds frequently follow conversions in online shopping systems, which motivates us to focus on effective conversions in order to build healthier shopping services. This paper defines the effective conversion rate (ECVR) as the probability that an item is purchased without any subsequent refund. A simple paradigm for ECVR prediction decomposes it into two sub-tasks: CVR prediction and post-conversion refund rate (RFR) prediction. However, RFR prediction suffers from data sparsity (DS) and sample selection bias (SSB), because refund behavior can only be observed after a user purchases. Furthermore, both conversion and refund events are subject to delayed feedback and are sequentially dependent, a problem we name cascade delayed feedback (CDF), which significantly harms the freshness of training data. Previous studies mainly tackle either DS and SSB or delayed feedback for a single event. To jointly tackle these issues in ECVR prediction, we propose an Entire space CAscade Delayed feedback modeling (ECAD) method. Specifically, ECAD addresses DS and SSB by constructing two tasks, CVR prediction and conversion & refund rate (CVRFR) prediction, under the entire space modeling framework. In addition, it carefully schedules auxiliary tasks that leverage both the conversion time and the refund time in the data to alleviate CDF. Experimental results on an offline industrial dataset and in online A/B testing demonstrate the effectiveness of ECAD. ECAD has also been deployed in one of Alibaba's recommender systems, where it contributes to a significant improvement in ECVR.
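
The relationship between the quantities named in the abstract can be written out as a short worked identity. This is a sketch under stated assumptions, not the paper's own formulation: the symbols $x$ (features of an impression), $c$ (conversion indicator) and $r$ (refund indicator) are introduced here for illustration, and CVRFR is read as the joint probability of conversion and refund, which the abstract suggests but does not state explicitly.

```latex
% Assumed notation (not from the source): x = features, c = conversion, r = refund.
\begin{align*}
\mathrm{CVR}(x)   &= P(c = 1 \mid x), \\
\mathrm{RFR}(x)   &= P(r = 1 \mid c = 1, x), \\
\mathrm{CVRFR}(x) &= P(c = 1, r = 1 \mid x) = \mathrm{CVR}(x)\,\mathrm{RFR}(x), \\
\mathrm{ECVR}(x)  &= P(c = 1, r = 0 \mid x)
                   = \mathrm{CVR}(x)\,\bigl(1 - \mathrm{RFR}(x)\bigr)
                   = \mathrm{CVR}(x) - \mathrm{CVRFR}(x).
\end{align*}
```

Under this reading, both CVR and CVRFR are defined over all samples rather than only over purchases, so ECVR can be obtained as their difference without fitting RFR directly on the sparse, selection-biased set of converted samples.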
