Ranking models play a key role in industrial recommender systems, advertising, and search engines. Existing works use semantic tags and user-item interaction behaviors (e.g., clicks and views) to model user interest and learn hidden item representations, from which a user-item preference score is estimated. However, these behavior-tag-based models become much less effective when user-item interactions are insufficient, which we call "the long-tail ranking problem". Existing ranking models ignore this problem, but it is common and important: any user or item can become long-tailed once it is not consistently active, even for a short period. In this paper, we propose a novel neighbor enhancement structure to help train the representation of the target user or item. It draws on similar neighbors, selected by static or dynamic similarity, and applies multi-level attention operations to balance the weights of different neighbors. Experiments on the well-known public dataset MovieLens 1M demonstrate the effectiveness of the method over the baseline behavior-tag-based model, with an absolute CTR AUC gain of 0.0259 on the long-tail user dataset.
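To make the neighbor enhancement idea concrete, the following is a minimal sketch and not the paper's actual architecture. It assumes scaled dot-product attention at two levels: first within each neighbor group (static and dynamic), then across the target's own embedding and the group summaries. The function names `attend` and `neighbor_enhance`, the toy embeddings, and the choice of dot-product scoring are all illustrative assumptions that the abstract does not specify.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, vectors):
    """Scaled dot-product attention (an assumed scoring function).

    Returns the attention-weighted average of `vectors` and the weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * v for q, v in zip(query, vec)) / scale for vec in vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, vectors))
        for i in range(len(query))
    ]
    return pooled, weights


def neighbor_enhance(target, neighbor_groups):
    """Hypothetical two-level attention over neighbor groups.

    Level 1: summarize each neighbor group with attention keyed on the target.
    Level 2: attend over the target itself plus the group summaries.
    """
    summaries = [attend(target, group)[0] for group in neighbor_groups]
    return attend(target, [target] + summaries)


# Toy embeddings, invented purely for illustration.
target = [0.2, -0.1, 0.4]
static_neighbors = [[0.3, 0.0, 0.5], [-0.2, 0.1, 0.1]]
dynamic_neighbors = [[0.1, -0.2, 0.3]]

enhanced, level_weights = neighbor_enhance(
    target, [static_neighbors, dynamic_neighbors]
)
```

In this sketch, `level_weights` holds one weight for the target's own embedding and one per neighbor group. Under this assumed design, a long-tail user with a poorly trained embedding can borrow signal from its neighbors while keeping a share of its own representation.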