A Simple yet Effective Framework for Active Learning to Rank

From arXiv. This paper has been accepted to Machine Intelligence Research, and a short version was presented at the NeurIPS 2022 Workshop on Human in the Loop Learning.

China has become the biggest online market in the world, with around 1 billion internet users, and Baidu runs the world's largest Chinese search engine, serving hundreds of millions of daily active users and responding to billions of queries per day. To handle diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users' queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the results. Among the components used in Baidu search, learning to rank (LTR) plays a critical role, and an extremely large number of queries, together with their relevant webpages, must be labeled in a timely manner to train and update the online LTR models. To reduce the cost and time of query/webpage labeling, this work studies the problem of Active Learning to Rank (active LTR), which selects unlabeled queries for annotation and training. Specifically, we first investigate Ranking Entropy (RE), a criterion that characterizes the entropy of the relevant webpages under a query as produced by a sequence of online LTR models updated at different checkpoints, using a Query-By-Committee (QBC) method. Then, we explore a new criterion, Prediction Variances (PV), that measures the variance of the prediction results for all relevant webpages under a query. Our empirical studies find that RE tends to favor low-frequency queries from the pool for labeling, while PV prioritizes high-frequency queries. Finally, we combine these two complementary criteria as the sample selection strategy for active learning. Extensive experiments with comparisons to baseline algorithms show that the proposed approach can train LTR models that achieve higher Discounted Cumulative Gain (a relative improvement of ΔDCG4 = 1.38%) with the same labeling budget.
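To make the two selection criteria concrete, here is a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption for illustration: RE is taken as the mean pairwise vote entropy over a committee of checkpoint models, PV as the variance of the committee-averaged scores across the webpages of a query, and the two are combined by summing min-max normalized scores. The function names (`ranking_entropy`, `prediction_variance`, `select_queries`) and the input layout are hypothetical, not taken from the paper.

```python
import numpy as np


def _plogp(p):
    """Elementwise p * log2(p), with the convention 0 * log2(0) = 0."""
    out = np.zeros_like(p, dtype=float)
    mask = p > 0
    out[mask] = p[mask] * np.log2(p[mask])
    return out


def ranking_entropy(scores):
    """Assumed RE: mean binary entropy of committee votes over webpage pairs.

    scores: (n_models, n_docs) predictions of the committee for one query.
    """
    scores = np.asarray(scores, dtype=float)
    n_docs = scores.shape[1]
    if n_docs < 2:
        return 0.0
    i, j = np.triu_indices(n_docs, k=1)
    # Fraction of committee members that score webpage i above webpage j.
    p = (scores[:, i] > scores[:, j]).mean(axis=0)
    entropy = -(_plogp(p) + _plogp(1.0 - p))
    return float(entropy.mean()) + 0.0  # "+ 0.0" normalizes -0.0 to 0.0


def prediction_variance(scores):
    """Assumed PV: variance across webpages of the committee-averaged score."""
    scores = np.asarray(scores, dtype=float)
    return float(np.var(scores.mean(axis=0)))


def _min_max(x):
    """Scale values to [0, 1]; a constant vector maps to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def select_queries(pool, budget):
    """Pick `budget` query ids with the highest combined RE + PV score.

    pool: dict mapping query id -> (n_models, n_docs) score matrix.
    The additive combination is an assumption, not the paper's rule.
    """
    ids = list(pool)
    re = _min_max([ranking_entropy(pool[q]) for q in ids])
    pv = _min_max([prediction_variance(pool[q]) for q in ids])
    order = np.argsort(-(re + pv), kind="stable")
    return [ids[k] for k in order[:budget]]


if __name__ == "__main__":
    # Toy pool: two committee members, two webpages per query.
    pool = {
        "a": [[1, 0], [0, 1]],  # members disagree on the order
        "b": [[1, 0], [1, 0]],  # members agree, scores are spread out
        "c": [[3, 0], [0, 1]],  # members disagree and scores are spread out
    }
    print(select_queries(pool, budget=1))
```

In this toy pool, a query on which the committee members disagree gets a high RE, while a query whose averaged scores are widely spread gets a high PV. Summing the two normalized scores is one simple way to let both criteria contribute to the selection.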
