State-of-the-art sequential recommendation relies heavily on self-attention-based recommender models. Yet such models are computationally expensive and often too slow for real-time recommendation. Furthermore, the self-attention operation is performed at the sequence level, which makes low-cost incremental inference challenging. Inspired by recent advances in efficient language modeling, we propose linear recurrent units for sequential recommendation (LRURec). Like recurrent neural networks, LRURec offers rapid inference and supports incremental inference on sequential inputs. By decomposing the linear recurrence operation and designing recursive parallelization in our framework, LRURec provides the additional benefits of reduced model size and parallelizable training. Moreover, we optimize the architecture of LRURec with a series of modifications that address the lack of non-linearity and improve training dynamics. To validate LRURec, we conduct extensive experiments on multiple real-world datasets and compare its performance against state-of-the-art sequential recommenders. Experimental results show that LRURec consistently outperforms baselines by a significant margin. Results also highlight the efficiency of LRURec, with parallelized training and fast inference on long sequences, showing its potential to further enhance user experience in sequential recommendation.
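To make the two properties named above concrete (incremental inference and parallelizable training), here is a minimal NumPy sketch. It assumes a diagonal linear recurrence of the form h_t = lam * h_{t-1} + u_t, which is a common way to write a linear recurrent unit but is not spelled out in this abstract. The function names `sequential_scan` and `doubling_scan` are hypothetical, and the doubling scheme is a generic parallel prefix scan used for illustration, not necessarily the recursive parallelization used in LRURec.

```python
import numpy as np

def sequential_scan(lam, u):
    """Incremental form: one cheap state update per new input.

    Assumed recurrence: h_t = lam * h_{t-1} + u_t, with h_0 = 0.
    lam has shape (d,), u has shape (T, d).
    """
    dtype = np.result_type(lam, u)
    h = np.zeros(u.shape[1], dtype=dtype)
    out = []
    for u_t in u:
        h = lam * h + u_t          # only the previous state is needed
        out.append(h)
    return np.stack(out)

def doubling_scan(lam, u):
    """Same recurrence computed in about log2(T) vectorized passes.

    Each position carries a pair (a, h): the accumulated decay and the
    partial state over a window ending at that position. Windows double
    in length on every pass (a Hillis-Steele style prefix scan).
    """
    dtype = np.result_type(lam, u)
    h = u.astype(dtype)                               # copy of the inputs
    a = np.broadcast_to(lam, u.shape).astype(dtype)   # decay per position
    shift = 1
    T = u.shape[0]
    while shift < T:
        # Merge each window with the window ending `shift` steps earlier.
        # The right-hand sides are evaluated before assignment, so the
        # overlapping slices do not interfere.
        h[shift:] = h[shift:] + a[shift:] * h[:-shift]
        a[shift:] = a[shift:] * a[:-shift]
        shift *= 2
    return h
```

Under these assumptions both functions return the same hidden states. The sequential form is the one that matters at serving time, since a new interaction costs a single state update, while the doubling form shows why the absence of a non-linearity inside the recurrence allows all positions of a training sequence to be processed together.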