Transformer models have achieved remarkable success in sequential recommender systems (SRSs). However, computing the attention matrix in traditional dot-product attention mechanisms results in quadratic complexity with respect to sequence length, leading to high computational costs for long-term sequential recommendation. Motivated by this observation, we propose a novel L2-Normalized Linear Attention mechanism for Transformer-based Sequential Recommender Systems (LinRec), which theoretically improves efficiency while preserving the learning capabilities of traditional dot-product attention. Specifically, by thoroughly examining the equivalence conditions of efficient attention mechanisms, we show that LinRec achieves linear complexity while preserving the essential properties of attention mechanisms. In addition, we reveal its latent efficiency properties by interpreting the proposed LinRec mechanism through a statistical lens. Extensive experiments on two public benchmark datasets demonstrate that combining LinRec with Transformer models yields performance comparable or even superior to state-of-the-art Transformer-based SRS models while significantly improving time and memory efficiency.
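To make the linearization concrete, below is a minimal PyTorch sketch of an L2-normalized linear attention step. It relies only on the abstract's description: L2 normalization applied to queries and keys, and the associativity of matrix multiplication to reorder (QK^T)V as Q(K^T V), which costs O(N·d^2) instead of O(N^2·d) for sequence length N and head dimension d. The function name `linear_attention` and the specific normalization axes (row-wise over each query, column-wise over each key feature) are assumptions for illustration, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def linear_attention(Q, K, V, eps=1e-8):
    """Sketch of L2-normalized linear attention (assumed LinRec-style).

    Q, K, V: tensors of shape (batch, seq_len, d).
    Reordering the matmul as Q @ (K^T @ V) avoids materializing the
    (seq_len x seq_len) attention matrix, giving linear cost in seq_len.
    """
    # Assumption: L2-normalize each query vector (rows of Q) ...
    Qn = F.normalize(Q, p=2, dim=-1, eps=eps)
    # ... and each key feature column (over sequence positions).
    Kn = F.normalize(K, p=2, dim=-2, eps=eps)
    # Associativity: (Qn @ Kn^T) @ V == Qn @ (Kn^T @ V).
    # Kn^T @ V is (d x d), so no N x N matrix is ever formed.
    return Qn @ (Kn.transpose(-2, -1) @ V)

# Usage: long sequences stay cheap because memory is O(N*d + d^2), not O(N^2).
B, N, d = 2, 4096, 64
Q, K, V = (torch.randn(B, N, d) for _ in range(3))
out = linear_attention(Q, K, V)  # shape (2, 4096, 64)
```

The key design point is that the normalization replaces the softmax as the mechanism that keeps attention weights well-scaled; once softmax is gone, nothing forces the quadratic multiplication order.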