Kernel methods are widely used in machine learning, especially for classification problems. However, the theoretical analysis of kernel classification remains limited. This paper investigates the statistical performance of kernel classifiers. Under mild assumptions on the conditional probability $\eta(x)=\mathbb{P}(Y=1\mid X=x)$, we use recent advances in the theory of kernel regression to derive an upper bound on the classification excess risk of a kernel classifier. We also obtain a minimax lower bound for Sobolev spaces, which shows that the proposed classifier is optimal. Our theoretical results can be extended to the generalization error of overparameterized neural network classifiers. To make these results more applicable in realistic settings, we also propose a simple method to estimate the interpolation smoothness of $2\eta(x)-1$ and apply it to real datasets.
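
To make the setting concrete, the following is a minimal sketch of a plug-in kernel classifier: it regresses the recoded labels $2Y-1\in\{-1,+1\}$ on $X$ and predicts with the sign of the fitted function, which targets the sign of $2\eta(x)-1$. The choice of kernel ridge regression with a Gaussian kernel, the function names (`rbf_kernel`, `fit_kernel_classifier`, `demo`), the parameters `lam` and `gamma`, and the synthetic $\eta$ are all assumptions made for illustration; the abstract does not specify the estimator, and this is not presented as the paper's method.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian kernel matrix between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_classifier(X, y, lam=1e-2, gamma=1.0):
    # Assumed estimator: kernel ridge regression on the recoded labels 2y - 1.
    # y is expected to contain labels in {0, 1}.
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), 2.0 * y - 1.0)

    def predict(X_new):
        # Plug-in rule: predict class 1 where the fitted function is non-negative.
        f = rbf_kernel(X_new, X, gamma) @ alpha
        return (f >= 0).astype(int)

    return predict

def demo(seed=0, n=400):
    # Synthetic data with a hypothetical eta(x) = 0.5 + 0.4 * sin(pi * x).
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(2 * n, 1))
    eta = 0.5 + 0.4 * np.sin(np.pi * X[:, 0])
    y = (rng.uniform(size=2 * n) < eta).astype(int)
    predict = fit_kernel_classifier(X[:n], y[:n])
    # Empirical test error of the kernel classifier and of the Bayes rule.
    test_err = float(np.mean(predict(X[n:]) != y[n:]))
    bayes_err = float(np.mean((eta[n:] >= 0.5).astype(int) != y[n:]))
    return test_err, bayes_err
```

Calling `demo()` returns the empirical test error of the sketch classifier together with that of the Bayes rule on the same held-out sample; their difference is an empirical stand-in for the classification excess risk discussed above.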