The sequential recommendation task aims to predict the item a user is interested in from his or her historical action sequence. However, random actions are inevitable: a user may randomly access one item among multiple candidates, or click several items in random order, so the sequence fails to provide stable, high-quality signals. To alleviate this issue, we propose the StatisTics-Driven Pre-training framework (STDP for short). The main idea is to combine statistical information with the pre-training paradigm to stabilize the optimization of the recommendation model. Specifically, we derive two types of statistical information: item co-occurrence across sequences and attribute frequency within a sequence. We then design three pre-training tasks: 1) A co-occurred items prediction task, which encourages the model to distribute its attention over multiple suitable targets instead of focusing only on the next item, which may be unstable. 2) We generate a paired sequence by replacing items with their co-occurred items and enforce its representation to be close to that of the original sequence, thus enhancing the model's robustness to random noise. 3) To reduce the impact of random actions on the user's long-term preferences, we encourage the model to capture sequence-level frequent attributes. Significant improvements on six datasets demonstrate the effectiveness and superiority of the proposal, and further analysis verifies that the STDP framework generalizes to other models.
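To make the two statistics concrete, the following is a minimal sketch of how they might be computed from raw interaction data. It is an illustration under stated assumptions, not the authors' implementation: the function names (`item_cooccurrence`, `top_cooccurred`, `attribute_frequency`), the choice to count each item pair at most once per sequence, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def item_cooccurrence(sequences):
    """Count, for every item pair, the number of sequences containing both.

    Assumption: a pair is counted once per sequence, regardless of order
    or of how many times each item repeats inside that sequence.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in combinations(sorted(set(seq)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def top_cooccurred(counts, item, k=2):
    """Return the k items that co-occur most often with `item`."""
    return [other for other, _ in counts[item].most_common(k)]


def attribute_frequency(seq, item_attrs):
    """Count how often each attribute appears among the items of one sequence."""
    freq = Counter()
    for item in seq:
        freq.update(item_attrs.get(item, ()))
    return freq


# Toy data (hypothetical): four user sequences and item-to-attribute labels.
sequences = [["a", "b", "c"], ["a", "b"], ["b", "c", "d"], ["a", "b", "d"]]
item_attrs = {"a": ["rock"], "b": ["rock", "pop"], "c": ["jazz"], "d": ["pop"]}

cooc = item_cooccurrence(sequences)
print(top_cooccurred(cooc, "a", k=1))
print(attribute_frequency(sequences[0], item_attrs).most_common(1))
```

In a setup like this, the top co-occurred items could serve as additional prediction targets or as replacement candidates when building a paired sequence, and the most frequent attributes could serve as sequence-level labels. How STDP actually selects and weights them is not specified in the abstract.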