Modern recommender systems leverage large-scale retrieval models consisting of two stages: training a dual-encoder model to embed queries and candidates in the same space, followed by an Approximate Nearest Neighbor (ANN) search to select top candidates given a query's embedding. In this paper, we propose a new single-stage paradigm: a generative retrieval model that autoregressively decodes the identifiers of the target candidates in one phase. To do this, instead of assigning a randomly generated atomic ID to each item, we generate Semantic IDs: each item is assigned a semantically meaningful tuple of codewords that serves as its unique identifier. We use a hierarchical method called RQ-VAE to generate these codewords. Once we have the Semantic IDs for all the items, a Transformer-based sequence-to-sequence model is trained to predict the Semantic ID of the next item. Since this model predicts the tuple of codewords identifying the next item directly in an autoregressive manner, it can be considered a generative retrieval model. We show that our recommender system trained in this new paradigm improves on the results achieved by current state-of-the-art (SOTA) models on the Amazon dataset. Moreover, we demonstrate that the sequence-to-sequence model coupled with hierarchical Semantic IDs offers better generalization and hence improves retrieval of cold-start items for recommendations.
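To illustrate how a tuple of codewords can act as a hierarchical identifier, the following is a minimal sketch of greedy residual quantization, the coarse-to-fine lookup that underlies RQ-VAE-style Semantic IDs. It is not the paper's implementation: in RQ-VAE the encoder and codebooks are learned, whereas the codebook values, the two-level depth, and the helper names `nearest_index` and `semantic_id` here are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_index(vector: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to `vector` (squared L2)."""
    def sq_dist(codeword: Vector) -> float:
        return sum((v - c) ** 2 for v, c in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def semantic_id(embedding: Vector,
                codebooks: Sequence[Sequence[Vector]]) -> Tuple[int, ...]:
    """Quantize `embedding` level by level; return one codeword index per level."""
    residual = list(embedding)
    codes: List[int] = []
    for codebook in codebooks:
        idx = nearest_index(residual, codebook)
        codes.append(idx)
        # The next level only sees what this level failed to explain.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(codes)


# Toy, hand-picked codebooks (assumption: RQ-VAE would learn these jointly
# with an encoder; the values below are purely illustrative).
CODEBOOKS = [
    [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)],   # level 1: coarse
    [(0.0, 0.0), (0.25, 0.0), (0.0, 0.25)],  # level 2: finer
]

item_a = semantic_id((1.2, 1.0), CODEBOOKS)
item_b = semantic_id((1.0, 1.3), CODEBOOKS)
print(item_a, item_b)
```

In this toy setup the two nearby embeddings receive identifiers that share the first codeword and differ only in the second. This shared-prefix structure is what makes the identifiers hierarchical, and it is plausibly what allows a sequence model to generalize across related items, including cold-start items.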