We present LiRank, a large-scale ranking framework at LinkedIn that brings state-of-the-art modeling architectures and optimization methods to production. We unveil several modeling improvements, including Residual DCN, which adds attention and residual connections to the well-known DCNv2 architecture. We share insights into combining and tuning state-of-the-art (SOTA) architectures, including Dense Gating, Transformers, and Residual DCN, to create a unified model. We also propose novel techniques for calibration and describe how we productionized deep-learning-based explore/exploit methods. To enable effective, production-grade serving of large ranking models, we detail how to train and compress models using quantization and vocabulary compression. We provide details about the deployment setup for large-scale use cases of Feed ranking, Jobs Recommendations, and Ads click-through rate (CTR) prediction. We summarize our learnings from various A/B tests by elucidating the most effective technical approaches. These ideas have contributed to relative metrics improvements across the board at LinkedIn: +0.5% member sessions in the Feed, +1.76% qualified job applications for Jobs search and recommendations, and +4.3% for Ads CTR. We hope this work can provide practical insights and solutions for practitioners interested in leveraging large-scale deep ranking systems.
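To make the Residual DCN description concrete, the following is a minimal numpy sketch of the DCNv2-style cross layer that it builds on, where the trailing skip connection is the residual term. The `residual_dcn` stacking function and its random initialization are illustrative assumptions for this sketch, and the attention component mentioned in the abstract is specified in the paper and not reproduced here.

```python
import numpy as np

def cross_layer(x0, xl, W, b):
    """One DCNv2-style cross layer: x_{l+1} = x0 * (W @ xl + b) + xl.

    The trailing "+ xl" is the residual (skip) connection; Residual DCN
    further augments the cross layers with attention (per the paper,
    omitted in this sketch).
    """
    return x0 * (W @ xl + b) + xl

def residual_dcn(x0, num_layers, rng):
    # Stack several cross layers with randomly initialized weights.
    # Illustrative only: in a real model, W and b are learned parameters.
    d = x0.shape[0]
    xl = x0
    for _ in range(num_layers):
        W = rng.normal(scale=0.1, size=(d, d))
        b = np.zeros(d)
        xl = cross_layer(x0, xl, W, b)
    return xl

# Example: cross a 4-dimensional input feature vector through 3 layers.
rng = np.random.default_rng(0)
x0 = rng.normal(size=4)
print(residual_dcn(x0, num_layers=3, rng=rng).shape)  # (4,)
```

The element-wise product with `x0` is what gives cross networks their explicit feature-interaction modeling; each layer adds one more degree of polynomial interaction over the input features.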