In an era of information overload, effective recommender systems are essential for managing the deluge of data across digital platforms. Multi-stage cascade ranking systems are widely used in industry, with retrieval and ranking as the two typical stages. Retrieval methods sift through a vast candidate pool to filter out irrelevant items, while ranking methods order the remaining candidates so that the most relevant items are presented to users. Unlike studies that focus on the ranking stage, this survey explores the critical yet often overlooked retrieval stage of recommender systems. To achieve precise and efficient personalized retrieval, we summarize existing work in three key areas: improving similarity computation between users and items, enhancing indexing mechanisms for efficient retrieval, and optimizing training methods for retrieval. We also provide a comprehensive set of benchmarking experiments on three public datasets. Furthermore, we highlight current industrial applications through a case study of retrieval practices at a specific company, covering the entire retrieval process and online serving, along with practical implications and challenges. By detailing the retrieval stage, which is fundamental to effective recommendation, this survey aims to bridge the existing knowledge gap and serve as a cornerstone for researchers interested in optimizing this critical component of cascade recommender systems.