We consider a ranking problem in which we have noisy observations of a matrix with isotonic columns whose rows have been permuted by some unknown permutation $\pi^*$. This setting encompasses many models, including crowd-labeling and ranking in tournaments by pairwise comparisons. In this work, we provide an optimal, polynomial-time procedure for recovering $\pi^*$, settling an open problem in [7]. As a byproduct, our procedure improves the state of the art for ranking problems in the stochastically transitive (SST) model. Our approach is based on iterative pairwise comparisons of rows through suitable data-driven weighted means of the columns. These weights are built by combining spectral methods with new dimension-reduction techniques. To deal with the important case of missing data, we establish a new concentration inequality for sparse and centered rectangular Wishart-type matrices.
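To make the observation model concrete, the following minimal sketch simulates it and ranks the rows with a naive baseline. It is not the procedure proposed in this work: it replaces the data-driven weighted means with plain unweighted row means over the observed entries. The function names, the Gaussian noise, the uniform signal matrix, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def simulate(n=20, d=2000, sigma=0.1, p_obs=0.5, seed=0):
    """Simulate noisy, partially observed data (all modelling choices are assumptions)."""
    rng = np.random.default_rng(seed)
    # Signal matrix with isotonic (nondecreasing) columns: sort each column.
    M = np.sort(rng.uniform(size=(n, d)), axis=0)
    # Unknown permutation: row i of the data is row pi_star[i] of M.
    pi_star = rng.permutation(n)
    Y = M[pi_star] + sigma * rng.normal(size=(n, d))
    # Missing data: each entry is observed independently with probability p_obs.
    mask = rng.random((n, d)) < p_obs
    return np.where(mask, Y, np.nan), pi_star


def rank_by_row_means(Y):
    """Naive baseline: rank rows by the unweighted mean of their observed entries."""
    means = np.nanmean(Y, axis=1)
    # argsort of argsort turns scores into ranks 0..n-1.
    return np.argsort(np.argsort(means))


Y, pi_star = simulate()
pi_hat = rank_by_row_means(Y)
print("largest rank error:", int(np.max(np.abs(pi_hat - pi_star))))
```

Unweighted means are a reasonable baseline when all columns are equally informative. When only a few columns separate two rows, averaging over every column dilutes the signal, which is the situation that data-driven column weights are meant to address.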