Gated Rotary-Enhanced Linear Attention with Rank Modulation for Long-term Sequential Recommendation

In Sequential Recommendation Systems (SRSs), Transformer models have demonstrated remarkable performance but incur high computational and memory costs, especially when modeling long-term user behavior sequences. Because of its quadratic complexity in sequence length, the dot-product attention mechanism in Transformers becomes expensive for long sequences. By approximating dot-product attention with carefully designed mapping functions, linear attention offers a more efficient alternative with linear complexity. However, existing linear attention methods face three limitations: 1) they often use learnable position encodings, which incur extra computational costs in long-term sequence scenarios; 2) limited by a low-rank deficiency, they may not sufficiently account for users' fine-grained local preferences (short-lived bursts of interest); and 3) they try to capture some temporary activities but often confuse these with stable, long-term interests, which can result in unclear or less effective recommendations. To remedy these drawbacks, we propose a long-term sequential Recommendation model with Gated Rotary-Enhanced Linear Attention (RecGRELA). Specifically, we first propose a Rotary-Enhanced Linear Attention (RELA) module that uses rotary position encodings to efficiently model long-range dependencies within the user's historical information. Then, to address the low-rank deficiency of linear attention, we introduce an Adaptive Rank Modulator. It incorporates a rank augmentation branch that explicitly injects local token mixing and a Gated Rank Selector that dynamically balances stable long-term preferences and transient short-term interests. Experimental results on four public benchmark datasets show that RecGRELA achieves state-of-the-art performance compared with existing SRSs based on Recurrent Neural Networks, Transformers, and Mamba while keeping memory overhead low.
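To make the complexity argument concrete, the following is a minimal sketch of generic linear attention and of rotary position encoding in plain Python. It is not the RELA module itself: the feature map `elu(x) + 1`, the non-causal formulation, the rotary base of 10000, and all function names are assumptions chosen for illustration. The sketch shows that reordering the product as phi(Q)(phi(K)^T V) gives the same output as building the full N x N score matrix, and that a rotary encoding makes the query-key dot product depend only on the relative position.

```python
import math

def elu_plus_one(x):
    """Positive feature map phi(x) = elu(x) + 1 (an assumed, commonly used choice)."""
    return x + 1.0 if x > 0 else math.exp(x)

def feature_map(m):
    return [[elu_plus_one(x) for x in row] for row in m]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def attention_quadratic(q, k, v):
    """Builds the full N x N score matrix: O(N^2 d) time, O(N^2) memory."""
    fq, fk = feature_map(q), feature_map(k)
    scores = matmul(fq, transpose(fk))             # N x N
    out = matmul(scores, v)                        # N x d
    return [[x / sum(s) for x in o] for o, s in zip(out, scores)]

def attention_linear(q, k, v):
    """Same result, reordered as phi(Q) (phi(K)^T V): O(N d^2) time, no N x N matrix."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)                  # d x d summary of keys and values
    k_sum = [sum(col) for col in zip(*fk)]         # length-d normalizer
    out = matmul(fq, kv)                           # N x d
    norms = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return [[x / z for x in o] for o, z in zip(out, norms)]

def rotate(vec, pos, base=10000.0):
    """Rotary position encoding: rotate each consecutive pair of dims by pos * theta_i.

    Assumes len(vec) is even.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out += [vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c]
    return out
```

Because the d x d summary `kv` does not grow with the sequence length N, the linear form avoids the quadratic score matrix, and the rotary encoding adds position information without any learnable position parameters.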
