Click-through rate (CTR) prediction is central to online advertising. Most existing approaches treat it as a binary classification problem and optimize binary cross entropy (BCE), but recent work has shown that combining BCE loss with a ranking loss yields substantial performance improvements. Why this combined loss works, however, is not yet fully understood. In this paper, we uncover a new challenge for BCE loss in scenarios with sparse positive feedback, such as CTR prediction: vanishing gradients for negative samples. We then offer a novel perspective on the effectiveness of ranking loss in CTR prediction: it generates larger gradients on negative samples, which mitigates their optimization difficulty and improves classification ability. Our perspective is supported by extensive theoretical analysis and by empirical evaluation on publicly available datasets. Furthermore, we deployed the ranking loss in Tencent's online advertising system, achieving lifts of 0.70% and 1.26% in Gross Merchandise Value (GMV) in two main scenarios. The code for our approach is available at https://github.com/SkylerLinn/Understanding-the-Ranking-Loss.
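To make the gradient argument concrete, the following minimal sketch compares the gradient that a single negative sample receives under BCE with the gradient it receives under a pairwise logistic (RankNet-style) ranking loss. The function names, the example logits, and the choice to average the ranking term over positives are assumptions made for illustration; the exact ranking loss and normalization used in the paper and its repository may differ.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def bce_grad_wrt_logit(logit, label):
    """Gradient of BCE with respect to the logit z: sigmoid(z) - y.

    For a negative sample (y = 0) this equals the predicted CTR,
    which is small when positives are sparse.
    """
    return sigmoid(logit) - label


def rank_grad_wrt_neg_logit(neg_logit, pos_logits):
    """Gradient of a pairwise logistic ranking loss w.r.t. a negative logit.

    Assumed per-negative loss: mean over positives i of
    -log(sigmoid(z_i - z_j)), whose derivative in z_j is
    mean over i of sigmoid(z_j - z_i).
    """
    total = sum(sigmoid(neg_logit - p) for p in pos_logits)
    return total / len(pos_logits)


# Hypothetical logits: the model predicts a low CTR for every sample,
# as is typical when clicks are rare.
neg_logit = -4.0
pos_logits = [-3.5, -3.0]

bce_grad = bce_grad_wrt_logit(neg_logit, label=0.0)
rank_grad = rank_grad_wrt_neg_logit(neg_logit, pos_logits)

print(f"BCE gradient on the negative:     {bce_grad:.4f}")
print(f"Ranking gradient on the negative: {rank_grad:.4f}")
```

Under these assumed values the BCE gradient equals the predicted CTR of the negative sample, so it shrinks as predictions approach the low base rate, whereas the pairwise term depends only on the score gap between the negative and the positives and stays comparatively large while that gap is small.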