In this paper, we propose a generic model-based re-ranking framework, MultiSlot ReRanker, which simultaneously optimizes relevance, diversity, and freshness. Specifically, our Sequential Greedy Algorithm (SGA) has linear time complexity, which makes it efficient enough for large-scale production recommendation engines. It achieved a lift of $+6\%$ to $+10\%$ in offline Area Under the receiver operating characteristic Curve (AUC), mainly due to explicitly modeling mutual influences among the items in a list and leveraging the second-pass ranking scores of multiple objectives. In addition, we have generalized the offline replay theory to multi-slot re-ranking scenarios, with trade-offs among multiple objectives. The offline replay results can be further improved by Pareto Optimality. Moreover, we have built a multi-slot re-ranking simulator based on OpenAI Gym and integrated with the Ray framework. It can be easily configured under different assumptions to quickly benchmark both reinforcement learning and supervised learning algorithms.
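To make the idea of sequential greedy slot filling concrete, the following is a minimal sketch and not the SGA implementation from the paper. It assumes each candidate carries precomputed second-pass scores (`relevance`, `freshness`) and a `category` used as a crude stand-in for diversity. The `Item` fields, the `slot_score` function, and its weights are all hypothetical. The diversity term depends on what has already been placed, which is a toy version of modeling mutual influence among items in a list. For a fixed number of slots, the cost of this sketch grows linearly with the number of candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    relevance: float  # hypothetical second-pass relevance score
    freshness: float  # hypothetical second-pass freshness score
    category: str     # hypothetical attribute used as a diversity proxy


def slot_score(item, seen_categories, w_rel=1.0, w_fresh=0.5, w_div=0.3):
    """Hypothetical utility of placing `item` in the next open slot.

    The weights are illustrative assumptions, not values from the paper.
    The diversity bonus is granted only if the item's category has not
    already been placed in an earlier slot.
    """
    diversity = 0.0 if item.category in seen_categories else 1.0
    return w_rel * item.relevance + w_fresh * item.freshness + w_div * diversity


def sequential_greedy_rerank(candidates, num_slots):
    """Fill slots top to bottom, taking the best remaining item each time."""
    remaining = list(candidates)
    selected = []
    seen_categories = set()
    for _ in range(min(num_slots, len(remaining))):
        # One pass over the remaining candidates per slot.
        best = max(remaining, key=lambda it: slot_score(it, seen_categories))
        selected.append(best)
        seen_categories.add(best.category)
        remaining.remove(best)
    return selected


# Toy candidate set (made-up scores, for illustration only).
CANDIDATES = [
    Item("a", relevance=0.90, freshness=0.2, category="news"),
    Item("b", relevance=0.85, freshness=0.1, category="news"),
    Item("c", relevance=0.60, freshness=0.9, category="jobs"),
    Item("d", relevance=0.50, freshness=0.3, category="video"),
]

if __name__ == "__main__":
    ranked = sequential_greedy_rerank(CANDIDATES, num_slots=3)
    print([item.item_id for item in ranked])
```

In this toy run, item `d` is placed ahead of item `b` in the third slot even though `b` has the higher relevance score, because a `news` item already occupies an earlier slot and `b` no longer earns the diversity bonus. A learned model that scores a candidate conditioned on the partial list could be substituted for `slot_score` without changing the greedy loop.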