In online ranking, a learning algorithm sequentially ranks a set of items and receives feedback on its ranking in the form of relevance scores. Since obtaining relevance scores typically involves human annotation, it is of great interest to consider a partial feedback setting in which feedback is restricted to the top-$k$ items in each ranking. Chaudhuri and Tewari [2017] developed a framework for analyzing online ranking algorithms with top-$k$ feedback. A key element of their work was the use of techniques from partial monitoring. In this paper, we further investigate online ranking with top-$k$ feedback and solve some open problems posed by Chaudhuri and Tewari [2017]. We provide a full characterization of minimax regret rates under the top-$k$ feedback model for all $k$ and for the following ranking performance measures: Pairwise Loss, Discounted Cumulative Gain, and Precision@n. In addition, we give an efficient algorithm that achieves the minimax regret rate for Precision@n.