In this paper, we study the generalization performance of non-parametric estimation for pairwise learning. Most existing work requires the hypothesis space to be convex or a VC-class, and the loss to be convex. These assumptions limit the applicability of the results to many popular methods, especially kernel methods and neural networks. We significantly relax these assumptions and establish a sharp oracle inequality for the empirical minimizer over a general hypothesis space with Lipschitz continuous pairwise losses. Our results apply to a wide range of pairwise learning problems, including ranking, AUC maximization, pairwise regression, and metric and similarity learning. As an application, we use the general results to study pairwise least squares regression and derive an excess generalization bound that matches the minimax lower bound for pointwise least squares regression up to a logarithmic term. The key novelty is to construct a structured deep ReLU neural network that approximates the true predictor, and to design a targeted hypothesis space consisting of such structured networks with controllable complexity. This application demonstrates that the general results can be used to analyze generalization performance on a variety of problems that existing approaches cannot handle.
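To make the object of study concrete, the following is a minimal sketch of the empirical risk that an empirical minimizer would minimize in pairwise least squares regression. It rests on assumptions not stated in the abstract: the predictor takes a pair of inputs, the loss is the squared error between the prediction and the label difference y - y', and the risk averages over all ordered pairs of distinct samples. The names `pairwise_empirical_risk`, `pairwise_squared_loss`, and `diff_predictor` are hypothetical and chosen for illustration only.

```python
from itertools import permutations


def pairwise_empirical_risk(f, xs, ys, loss):
    """Average a pairwise loss over all ordered pairs (i, j) with i != j."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples to form a pair")
    total = 0.0
    for i, j in permutations(range(n), 2):
        total += loss(f(xs[i], xs[j]), ys[i], ys[j])
    return total / (n * (n - 1))


def pairwise_squared_loss(pred, y, y_prime):
    # Assumed form of the pairwise least squares loss: compare the
    # prediction for the pair with the difference of the two labels.
    return (pred - (y - y_prime)) ** 2


def diff_predictor(x, x_prime):
    # Hypothetical pairwise predictor, exact for labels y = 2 * x.
    return 2.0 * (x - x_prime)


xs = [0.0, 1.0, 2.0]
ys = [0.0, 2.0, 4.0]
risk = pairwise_empirical_risk(diff_predictor, xs, ys, pairwise_squared_loss)
```

Because the risk is a double sum over sample pairs, its terms are not independent, which is one reason pairwise learning needs its own generalization analysis rather than a direct reuse of pointwise results.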