In search systems, the multi-stage cascade architecture is a prevalent design, typically consisting of sequential modules such as matching, pre-ranking, and ranking. It is generally acknowledged that the model used in the pre-ranking stage must strike a balance between effectiveness and efficiency. Thus, the most commonly employed architecture is the representation-focused, vector-product-based model. However, this architecture lacks effective interaction between the query and the document, which reduces the effectiveness of the search system. To address this issue, we present a novel pre-ranking framework called RankDFM. Our framework uses DeepFM as the backbone and employs a pairwise training paradigm to learn the ranking of videos under a query. The ability of RankDFM to cross features yields significant improvements in both offline evaluation and online A/B testing. Furthermore, we introduce a learnable feature selection scheme that optimizes the model and reduces online inference time to a level comparable to that of a tree model. Currently, RankDFM has been deployed in the search system of a short-video app, serving hundreds of millions of users daily.
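To make the two ingredients named in the abstract more concrete (feature crossing and pairwise training), the following is a minimal sketch rather than the authors' implementation. It shows only the factorization-machine part of a DeepFM-style scorer (the deep MLP branch that DeepFM adds on top of the shared embeddings is omitted) together with a RankNet-style pairwise logistic loss. The abstract does not state which pairwise loss is used, so that choice is an assumption, and the names `fm_score` and `pairwise_logistic_loss` are hypothetical.

```python
import math
from typing import Sequence


def fm_score(bias: float,
             weights: Sequence[float],
             factors: Sequence[Sequence[float]],
             x: Sequence[float]) -> float:
    """First-order plus second-order FM score for one feature vector x.

    The second-order term sum_{i<j} <v_i, v_j> x_i x_j is computed with
    the standard linear-time identity
        0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2].
    """
    linear = bias + sum(w * xi for w, xi in zip(weights, x))
    k = len(factors[0])  # embedding dimension
    crossed = 0.0
    for f in range(k):
        s = sum(v[f] * xi for v, xi in zip(factors, x))
        s_sq = sum((v[f] * xi) ** 2 for v, xi in zip(factors, x))
        crossed += 0.5 * (s * s - s_sq)
    return linear + crossed


def pairwise_logistic_loss(score_pos: float, score_neg: float) -> float:
    """Assumed pairwise loss: log(1 + exp(-(score_pos - score_neg))).

    Written as a numerically stable softplus of the negated margin.
    """
    diff = score_pos - score_neg
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Toy example with three features and 2-dimensional embeddings.
# All values are made up for illustration.
weights = [0.1, -0.2, 0.3]
factors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
preferred_doc = [1.0, 1.0, 0.0]   # features of the video ranked higher
other_doc = [1.0, 0.0, 1.0]       # features of the video ranked lower

s_pos = fm_score(0.0, weights, factors, preferred_doc)
s_neg = fm_score(0.0, weights, factors, other_doc)
loss = pairwise_logistic_loss(s_pos, s_neg)
```

In a pairwise setup of this kind, each training example is a pair of videos retrieved for the same query, and the loss shrinks as the preferred video's score exceeds the other's by a larger margin. The crossed term is what a pure vector-product model lacks: it lets query-side and document-side features interact directly inside the scorer.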