Generative retrieval methods employ sequential modeling techniques, such as transformers, to generate candidate items for recommender systems. These methods have demonstrated promising results on academic benchmarks, surpassing traditional retrieval models such as two-tower architectures. However, a key limitation is that current approaches require a separate model for each product surface, because building a unified model that accommodates the different business needs of various surfaces has proven challenging. Furthermore, existing methods often fail to capture the evolution of user interests over a sequence, focusing instead on predicting only the next item. This paper introduces PinRec, a novel unified generative retrieval model for all of Pinterest's recommendation surfaces, including home feed, search, and related pins. PinRec is pretrained on user activity sequences aggregated across surfaces, then finetuned for each surface using impression data from that surface. This pretraining and finetuning approach enables a single unified model while still adapting to the needs of individual surfaces. To better align recommendations with surface-specific business goals, PinRec incorporates a novel outcome-conditioned generation mechanism that targets different outcomes for each surface, which further enhances the impact of finetuning. Our experiments show that PinRec balances performance, diversity, and efficiency, delivering significant gains such as a 4% increase in search saves. To our knowledge, this paper presents the first rigorous study of a unified generative retrieval model built and deployed at Pinterest scale, marking a significant milestone in the field.
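To make the idea of outcome-conditioned generation concrete, the following is a minimal toy sketch and not PinRec's actual mechanism, which the abstract does not specify. It assumes, purely for illustration, that a desired outcome (for example a save or a click) is represented by an embedding added to the user representation before items are scored, and that each surface requests a different number of candidates per outcome. All names, embeddings, and budgets below are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def condition(user_state: Vector, outcome_emb: Vector) -> List[float]:
    """Assumed conditioning step: shift the user state by the outcome embedding."""
    return [u + o for u, o in zip(user_state, outcome_emb)]


def retrieve(user_state: Vector, outcome: str,
             outcome_embs: Dict[str, Vector],
             item_embs: Dict[str, Vector], k: int) -> List[str]:
    """Return the k items scoring highest against the outcome-conditioned query."""
    query = condition(user_state, outcome_embs[outcome])
    ranked = sorted(item_embs,
                    key=lambda item: dot(query, item_embs[item]),
                    reverse=True)
    return ranked[:k]


def retrieve_for_surface(user_state: Vector, budget: Dict[str, int],
                         outcome_embs: Dict[str, Vector],
                         item_embs: Dict[str, Vector]) -> List[str]:
    """Merge candidates across outcomes using a hypothetical per-surface budget."""
    results: List[str] = []
    for outcome, k in budget.items():
        for item in retrieve(user_state, outcome, outcome_embs, item_embs, k):
            if item not in results:  # drop duplicates, keep first-seen order
                results.append(item)
    return results


# Hypothetical toy data: two outcomes and three items in a 2-d space.
OUTCOME_EMBS = {"save": [0.0, 1.0], "click": [1.0, 0.0]}
ITEM_EMBS = {"pin_a": [1.0, 0.0], "pin_b": [0.0, 2.0], "pin_c": [0.5, 0.5]}
USER_STATE = [1.0, 0.0]

# A search-like surface might weight saves, so it asks for save candidates first.
SEARCH_BUDGET = {"save": 1, "click": 1}
candidates = retrieve_for_surface(USER_STATE, SEARCH_BUDGET,
                                  OUTCOME_EMBS, ITEM_EMBS)
```

In this sketch the same user state yields different candidates depending on the requested outcome, which is the intuition behind targeting different outcomes per surface; a real system would learn the conditioning inside the sequence model rather than apply a fixed vector shift.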