GRank: Towards Target-Aware and Streamlined Industrial Retrieval with a Generate-Rank Framework

Industrial-scale recommender systems rely on a cascade pipeline in which the retrieval stage must return a high-recall candidate set from billions of items under tight latency constraints. Existing solutions either (i) suffer from limited expressiveness in capturing fine-grained user-item interactions, as seen in decoupled dual-tower architectures that rely on separate encoders and in generative models that lack precise target-aware matching capabilities, or (ii) build structured indices (tree, graph, quantization) whose item-centric topologies struggle to incorporate dynamic user preferences and incur prohibitive construction and maintenance costs. We present GRank, a novel structured-index-free retrieval paradigm that seamlessly unifies target-aware learning with user-centric retrieval. Our key innovations include: (1) a target-aware Generator trained to perform personalized candidate generation via GPU-accelerated MIPS, eliminating the semantic drift and maintenance costs of structured indexing; (2) a lightweight but powerful Ranker that performs fine-grained, candidate-specific inference on small subsets; and (3) an end-to-end multi-task learning framework that ensures semantic consistency between the generation and ranking objectives. Extensive experiments on two public benchmarks and a billion-item production corpus demonstrate that GRank improves Recall@500 by over 30% and achieves 1.7$\times$ the P99 QPS of state-of-the-art tree- and graph-based retrievers. GRank has been fully deployed in production on our recommendation platform since Q2 2025, serving 400 million active users with 99.95% service availability. Online A/B tests confirm significant improvements in core engagement metrics, with Total App Usage Time increasing by 0.160% in the main app and 0.165% in the Lite version.
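The two-stage flow described above (index-free candidate generation by MIPS, then candidate-specific ranking on a small subset) can be sketched as follows. This is only a toy illustration under stated assumptions, not GRank's implementation: the function names, the mean-pooled user vector, and the attention-style ranker are all hypothetical, and exhaustive NumPy inner products stand in for GPU-accelerated MIPS.

```python
import numpy as np

def generate_candidates(user_vec, item_embs, k):
    """Exhaustive MIPS: ids of the k items with the largest inner product.

    Assumption: brute-force scoring on CPU stands in for GPU MIPS.
    """
    scores = item_embs @ user_vec
    top = np.argpartition(-scores, k - 1)[:k]   # unordered top-k
    return top[np.argsort(-scores[top])]        # sorted by descending score

def rank_candidates(user_hist, item_embs, cand_ids, n):
    """Hypothetical target-aware ranker: each candidate attends over the
    user's history, so the user representation depends on the candidate."""
    cands = item_embs[cand_ids]                        # (k, d)
    logits = cands @ user_hist.T                       # (k, h)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)      # softmax over history
    context = weights @ user_hist                      # (k, d), per candidate
    scores = np.sum(context * cands, axis=1)
    order = np.argsort(-scores)[:n]
    return cand_ids[order], scores[order]

def retrieve(user_hist, item_embs, k=500, n=50):
    """Generate k candidates with a target-agnostic user vector, then
    re-score only those k with the target-aware ranker."""
    user_vec = user_hist.mean(axis=0)   # assumption: mean-pooled history
    cand_ids = generate_candidates(user_vec, item_embs, k)
    return rank_candidates(user_hist, item_embs, cand_ids, n)
```

The point of the split is cost: the generator scores every item with a single inner product, while the more expensive candidate-dependent computation runs only on the k survivors.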
