Recent years have witnessed the success of sequential modeling, generative recommenders, and large language models for recommendation. Although scaling laws have been validated for sequential models, such models are computationally inefficient in real-world applications like recommendation, because the cost of the transformer grows non-linearly (quadratically) with sequence length. To improve the efficiency of sequential models, we introduce a novel approach to sequential recommendation that leverages personalization techniques to enhance both efficiency and performance. Our method compresses long user interaction histories into learnable tokens, which are then combined with recent interactions to generate recommendations. This approach significantly reduces computational costs while maintaining high recommendation accuracy. Our method can be applied to existing transformer-based recommendation models, e.g., HSTU and HLLM. Extensive experiments on multiple sequential models demonstrate its versatility and effectiveness. Source code is available at \href{https://github.com/facebookresearch/PerSRec}{https://github.com/facebookresearch/PerSRec}.
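To make the compression idea concrete, the sketch below is a minimal, hypothetical PyTorch illustration (not the PerSRec implementation): a small set of learnable tokens attends over the long interaction history via cross-attention, and the resulting compressed tokens are concatenated with recent interactions before a standard transformer encoder. All class and parameter names (\texttt{CompressedHistoryRecommender}, \texttt{num\_compress\_tokens}, the toy sizes) are assumptions for illustration only.

\begin{verbatim}
import torch
import torch.nn as nn

# Minimal sketch (not the official PerSRec code): long history is compressed
# into a few learnable tokens, then concatenated with recent interactions.
class CompressedHistoryRecommender(nn.Module):
    def __init__(self, num_items, d_model=64, num_compress_tokens=8, nhead=4):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, d_model)
        # Learnable tokens that summarize the long history (hypothetical size).
        self.compress_tokens = nn.Parameter(
            torch.randn(num_compress_tokens, d_model))
        self.compressor = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_items)

    def forward(self, long_history, recent_items):
        # long_history: (B, L_long) item ids; recent_items: (B, L_recent) ids
        B = long_history.size(0)
        long_emb = self.item_emb(long_history)      # (B, L_long, d)
        recent_emb = self.item_emb(recent_items)    # (B, L_recent, d)
        queries = self.compress_tokens.unsqueeze(0).expand(B, -1, -1)
        # Cross-attention: learnable tokens attend over the long history,
        # shrinking its length from L_long to num_compress_tokens.
        compressed, _ = self.compressor(queries, long_emb, long_emb)
        # The quadratic-cost transformer now runs over a much shorter sequence.
        seq = torch.cat([compressed, recent_emb], dim=1)
        hidden = self.encoder(seq)
        return self.head(hidden[:, -1])             # next-item logits

# Toy usage: two users, 512-item long histories, 16 recent interactions.
model = CompressedHistoryRecommender(num_items=1000)
long_hist = torch.randint(0, 1000, (2, 512))
recent = torch.randint(0, 1000, (2, 16))
logits = model(long_hist, recent)  # shape (2, 1000)
\end{verbatim}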