Pairwise Ranking Loss for Multi-Task Learning in Recommender Systems

Multi-Task Learning (MTL) plays a crucial role in real-world applications such as advertising and recommender systems, where it aims to learn robust representations while minimizing resource consumption. MTL optimizes multiple tasks simultaneously to build a unified model that serves diverse objectives. In online advertising systems, Click-Through Rate (CTR) and Conversion Rate (CVR) prediction are often trained jointly as an MTL problem. However, prior work has overlooked the fact that a conversion ($y_{cvr}=1$) requires a preceding click ($y_{ctr}=1$). In other words, some click labels are accompanied by a conversion, while others are not. Moreover, click labels without a conversion are considerably more likely to be noisy than those with one, and existing methods cannot distinguish between the two cases. In this study, exposure labels corresponding to conversions are treated as definitive indicators, and a novel task-specific loss, the \textbf{p}air\textbf{wise} \textbf{r}anking (PWiseR) loss, is computed between model predictions to encourage the model to rely more on these labels. To evaluate the proposed loss function, experiments were conducted on different MTL and Single-Task Learning (STL) models using four public MTL datasets, namely Alibaba FR, NL, US, and CCP, along with a proprietary industrial dataset. The results indicate that the proposed loss function outperforms the Binary Cross-Entropy (BCE) loss in most cases in terms of the AUC metric.
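To make the idea of a pairwise ranking loss between model predictions concrete, the following is a minimal sketch and not the PWiseR formulation itself, which the abstract does not spell out. It assumes a standard logistic pairwise loss and assumes that pairs are formed between clicked-and-converted samples ($y_{ctr}=1$, $y_{cvr}=1$) and non-clicked samples ($y_{ctr}=0$). The function names `softplus` and `pairwise_ranking_loss` and the pairing rule are hypothetical choices made for illustration.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ranking_loss(ctr_logits, y_ctr, y_cvr):
    """Mean logistic pairwise loss over (anchor, negative) pairs.

    Assumption (not taken from the source): anchors are samples with a
    click AND a conversion, negatives are samples without a click. The
    loss is small when every anchor logit exceeds every negative logit.
    """
    anchors = [s for s, c, v in zip(ctr_logits, y_ctr, y_cvr) if c == 1 and v == 1]
    negatives = [s for s, c in zip(ctr_logits, y_ctr) if c == 0]
    if not anchors or not negatives:
        # No pair can be formed in this batch, so the term contributes nothing.
        return 0.0
    # log(1 + exp(-(s_anchor - s_negative))) for every pair.
    total = sum(softplus(-(a - n)) for a in anchors for n in negatives)
    return total / (len(anchors) * len(negatives))


# Toy batch: sample 0 is clicked and converted, sample 1 is clicked only,
# sample 2 is not clicked. Only the pair (sample 0, sample 2) is formed.
loss = pairwise_ranking_loss([2.0, 0.5, -1.0], [1, 1, 0], [1, 0, 0])
```

Under these assumptions, the clicked-but-not-converted sample appears in no pair, so only conversion-backed click labels drive the ranking term. In practice such a term would typically be added to the per-task BCE loss with a weighting factor, which is also an assumption here.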
