Click-through rate (CTR) prediction is a vital task in industrial recommendation systems. Most existing methods focus on designing the network architecture of the CTR model to improve accuracy, yet they still suffer from the data sparsity problem. The problem is especially acute in industrial recommendation systems, where negative-sample down-sampling, widely applied because of resource limitations, worsens the sparsity and degrades performance. In this paper, we propose \textbf{A}uxiliary Match \textbf{T}asks for enhancing \textbf{C}lick-\textbf{T}hrough \textbf{R}ate prediction accuracy (AT4CTR) by alleviating the data sparsity problem. Specifically, we design two match tasks, inspired by collaborative filtering, to enhance the relevance modeling between user and item. Because the "click" action is a strong signal that directly indicates the user's preference for the item, the first match task pulls the user and item representations closer together for positive samples. Because a user's past click behaviors can also serve as a representation of that user, we adopt next-item prediction as the second match task. Both match tasks use InfoNCE as their loss function. The two match tasks provide meaningful training signals that speed up the model's convergence and alleviate data sparsity. We conduct extensive experiments on one public dataset and one large-scale industrial recommendation dataset. The results demonstrate the effectiveness of the proposed auxiliary match tasks. AT4CTR has been deployed in a real industrial advertising system, where it has brought a remarkable revenue gain.
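To make the loss shared by both match tasks concrete, the following is a minimal sketch of InfoNCE over a batch of paired user and item representations. The abstract does not specify the negative-sampling scheme, the similarity function, or the temperature, so the in-batch negatives, the cosine similarity, the default `temperature=0.1`, and the function name `info_nce` are assumptions made for illustration, not the authors' implementation.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(user_embs, item_embs, temperature=0.1):
    """Mean InfoNCE loss over a batch of positive (user, item) pairs.

    Row i of user_embs and row i of item_embs form a positive pair.
    Every other item in the batch acts as a negative for user i
    (in-batch negatives: an assumption, not taken from the paper).
    """
    users = [_normalize(u) for u in user_embs]
    items = [_normalize(v) for v in item_embs]
    total = 0.0
    for i, u in enumerate(users):
        # Cosine similarity between user i and every item, scaled by temperature.
        logits = [sum(a * b for a, b in zip(u, v)) / temperature for v in items]
        # Numerically stable log-sum-exp over all candidate items.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Negative log-probability of picking the positive item.
        total += log_denominator - logits[i]
    return total / len(users)


if __name__ == "__main__":
    users = [[1.0, 0.0], [0.0, 1.0]]
    matched_items = [[0.9, 0.1], [0.2, 0.8]]
    print(info_nce(users, matched_items))
```

Under these assumptions, the first match task would pass the user representation and the clicked target item representation as the two inputs, and the second would pass a representation built from the click history together with the next clicked item. How each auxiliary loss is weighted against the CTR loss is not stated in the abstract and is left out here.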