On multi-vertical e-commerce platforms such as DoorDash, newer product verticals like grocery and retail present a significant opportunity for personalization. A key challenge is the user "cold start" problem: many users have little or no interaction history in these verticals. This paper introduces a framework for improving recommendation quality by transferring knowledge from data-rich verticals (e.g., restaurants at DoorDash) to data-sparse ones. We leverage Large Language Models (LLMs) to perform generative inference, synthesizing sparse, high-dimensional features that capture latent user affinities. Specifically, we employ a hierarchical Retrieval-Augmented Generation (RAG) pipeline to derive multi-level taxonomic features from users' restaurant order histories and search queries. These generated features, which encode both long-term cross-vertical preferences and short-term intent, are integrated into a production Multi-Task Learning (MTL) ranking model. Through extensive offline and online evaluation, we demonstrate that this approach significantly improves personalization and engagement in emerging business verticals, effectively bridging the behavioral data gap.