The lightweight ad ranking layer, which sits after the retrieval stage and before the fine ranker, plays a critical role in the success of a cascaded ad recommendation system. This layer must serve multiple optimization tasks that depend on the ad domain, e.g., Click-Through Rate (CTR) for click ads and Conversion Rate (CVR) for conversion ads, across multiple surfaces where an ad is served (home feed, search, or related item recommendation) and across diverse ad products (shopping or standard ads). Jointly and holistically optimizing the lightweight ranker so that the value to the platform, to advertisers, and to users is maximized is therefore a fundamentally challenging problem in industry. Deep Neural Network (DNN)-based multi-task learning (MTL) handles multiple goals naturally, with each prediction head mapping to a particular optimization goal. In practice, however, it is unclear how to unify data from different surfaces and ad products into a single model. To make MTL effective, it is critical to learn domain-specialized knowledge and to transfer knowledge explicitly between domains. We present a Multi-Task Multi-Domain (MTMD) architecture under the classic Two-Tower paradigm, with the following key contributions: 1) we handle different prediction tasks, ad products, and ad serving surfaces in a unified framework; 2) we propose a novel mixture-of-experts architecture that learns both the specialized knowledge of each domain and the common knowledge shared between domains; 3) we propose a domain adaptation module to encourage knowledge transfer between experts; 4) we constrain the modeling of different prediction tasks. MTMD reduces the offline loss by 12% to 36%, which corresponds to a 2% online reduction in cost per click. We have deployed this single MTMD framework in production for Pinterest ad recommendation, where it replaces 9 production models.
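As a rough illustration of the idea of combining shared and domain-specific experts with one prediction head per task, the following is a minimal sketch and not the MTMD architecture itself. The class name `ToyMultiDomainMoE`, the layer sizes, the two-way gate, and the domain and task names are all assumptions made for this example; the domain adaptation module and the Two-Tower structure are not modeled here.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(rng, n_out, n_in):
    """Return (weights, bias) with small random weights and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class ToyMultiDomainMoE:
    """Hypothetical toy model: one shared expert, one expert per domain,
    a per-domain gate that mixes the two, and one sigmoid head per task."""

    def __init__(self, domains, tasks, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_hidden, n_in)
        self.domain_experts = {d: init_layer(rng, n_hidden, n_in) for d in domains}
        self.gates = {d: init_layer(rng, 2, n_in) for d in domains}
        self.heads = {t: init_layer(rng, 1, n_hidden) for t in tasks}

    def forward(self, x, domain):
        # Common knowledge: the shared expert sees examples from every domain.
        shared_out = relu(linear(*self.shared, x))
        # Specialized knowledge: only the expert of the example's domain is used.
        domain_out = relu(linear(*self.domain_experts[domain], x))
        # The gate produces two mixing weights that sum to 1.
        g_shared, g_domain = softmax(linear(*self.gates[domain], x))
        mixed = [g_shared * s + g_domain * d for s, d in zip(shared_out, domain_out)]
        # One prediction head per optimization task (e.g. CTR, CVR).
        return {t: sigmoid(linear(*head, mixed)[0]) for t, head in self.heads.items()}


model = ToyMultiDomainMoE(
    domains=["home_feed", "search", "related_item"],
    tasks=["ctr", "cvr"],
    n_in=4,
    n_hidden=8,
)
features = [0.2, -0.1, 0.7, 0.05]  # made-up input features
preds = model.forward(features, "search")
```

Here `preds` maps each task name to a probability for the given domain. Routing an example through both a shared expert and its own domain expert is one simple way to separate common from specialized knowledge; the weights are random and untrained, so the outputs carry no meaning beyond showing the data flow.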