Modern recommendation systems primarily rely on attention mechanisms with quadratic complexity, which limits their ability to handle long user sequences and slows down inference. While linear attention is a promising alternative, existing research faces three critical challenges: (1) temporal signals are often overlooked, or are integrated via naive coupling that causes mutual interference between temporal and semantic signals and neglects behavioral periodicity; (2) existing linear frameworks provide insufficient positional information; and (3) prior work focuses primarily on short sequences and shallow architectures. To address these issues, we propose FuXi-Linear, a linear-complexity model designed for efficient long-sequence recommendation. Our approach introduces two key components: (1) a Temporal Retention Channel that independently computes periodic attention weights from temporal data, preventing crosstalk between temporal and semantic signals; and (2) a Linear Positional Channel that integrates positional information through learnable kernels while preserving linear complexity. Moreover, we demonstrate that FuXi-Linear exhibits a robust power-law scaling property at sequence lengths in the thousands, a characteristic largely unexplored in prior linear recommendation studies. Extensive experiments on sequences of several thousand tokens demonstrate that FuXi-Linear outperforms state-of-the-art models in recommendation quality, while achieving up to 10$\times$ speedup in the prefill stage and up to 21$\times$ speedup in the decode stage compared to competitive baselines. Our code is publicly available at https://github.com/USTC-StarTeam/fuxi-linear.
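To make the quadratic-versus-linear contrast in the abstract concrete, the following is a minimal sketch of generic kernelized causal linear attention. It is not the FuXi-Linear architecture: it contains neither the Temporal Retention Channel nor the Linear Positional Channel. The function names and the `elu(x) + 1` feature map are assumptions chosen for illustration. The sketch shows only that a causal attention output can be computed either by scoring all earlier positions (quadratic in sequence length) or by carrying a fixed-size running state (linear in sequence length), with matching results.

```python
import math
import random


def feature_map(x):
    # elu(x) + 1 keeps features positive; an assumed, commonly used choice.
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def causal_attention_quadratic(q, k, v):
    # Reference form: position i scores every position j <= i, O(n^2) overall.
    phi_q = [feature_map(x) for x in q]
    phi_k = [feature_map(x) for x in k]
    out = []
    for i in range(len(q)):
        weights = [dot(phi_q[i], phi_k[j]) for j in range(i + 1)]
        total = sum(weights)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights)) / total
                    for c in range(len(v[0]))])
    return out


def causal_attention_linear(q, k, v):
    # Recurrent form: a single pass with a fixed-size d x d_v state, O(n) overall.
    d, d_v = len(q[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d)]  # running sum of outer(phi(k_j), v_j)
    norm = [0.0] * d                         # running sum of phi(k_j)
    out = []
    for q_i, k_i, v_i in zip(q, k, v):
        phi_q, phi_k = feature_map(q_i), feature_map(k_i)
        for a in range(d):
            norm[a] += phi_k[a]
            for c in range(d_v):
                state[a][c] += phi_k[a] * v_i[c]
        denom = dot(phi_q, norm)
        out.append([sum(phi_q[a] * state[a][c] for a in range(d)) / denom
                    for c in range(d_v)])
    return out


# Toy comparison on random inputs (sizes are arbitrary illustration values).
random.seed(0)
n, d = 16, 4
q, k, v = ([[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
           for _ in range(3))
slow = causal_attention_quadratic(q, k, v)
fast = causal_attention_linear(q, k, v)
max_diff = max(abs(a - b) for ra, rb in zip(slow, fast) for a, b in zip(ra, rb))
print(max_diff < 1e-9)
```

Because the recurrent form keeps only `state` and `norm` between steps, producing the output for one new item costs the same regardless of how long the history is. This is the general property behind decode-stage speedups in linear attention models, though the specific gains reported in the abstract depend on the authors' implementation.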