This paper is devoted to the problem of determining the concentration bounds that are achievable in non-parametric regression. We consider the setting where features are supported on a bounded subset of $\mathbb{R}^d$, the regression function is Lipschitz, and the noise is only assumed to have a finite second moment. We first specify the fundamental limits of the problem by establishing a general lower bound on deviation probabilities, and then construct explicit estimators that achieve this bound. These estimators are obtained by applying the median-of-means principle to classical local averaging rules in non-parametric regression, including nearest neighbors and kernel procedures.
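To illustrate the median-of-means principle applied to a local averaging rule, the following is a minimal sketch for a nearest-neighbor estimate at a single query point. It is not the paper's construction: the function names `knn_estimate` and `mom_knn_estimate`, the random partition into blocks, the Euclidean metric, and the parameters `k` and `n_blocks` are all assumptions made for illustration. The sample is split into disjoint blocks, a k-nearest-neighbor average is computed on each block, and the median of the block estimates is returned.

```python
import numpy as np


def knn_estimate(X, y, x0, k):
    """Plain k-NN local average of y at the query point x0 (Euclidean metric)."""
    dists = np.linalg.norm(X - x0, axis=1)
    nearest = np.argsort(dists)[:k]
    return float(np.mean(y[nearest]))


def mom_knn_estimate(X, y, x0, k, n_blocks, rng=None):
    """Median-of-means k-NN estimate at x0 (illustrative sketch).

    The sample is randomly partitioned into n_blocks disjoint blocks, a k-NN
    average is computed within each block, and the median of the block
    estimates is returned. The random partition is an assumption of this sketch.
    """
    n = len(y)
    if not 1 <= n_blocks <= n:
        raise ValueError("n_blocks must be between 1 and the sample size")
    rng = np.random.default_rng(rng)
    blocks = np.array_split(rng.permutation(n), n_blocks)
    block_estimates = [
        knn_estimate(X[b], y[b], x0, min(k, len(b))) for b in blocks
    ]
    return float(np.median(block_estimates))


# Toy data: the regression function is identically zero, with one gross outlier
# placed next to the query point.
X = np.linspace(0.0, 1.0, 90).reshape(-1, 1)
y = np.zeros(90)
y[45] = 1e6
x0 = np.array([0.5])

plain = knn_estimate(X, y, x0, k=5)
robust = mom_knn_estimate(X, y, x0, k=5, n_blocks=3, rng=0)
```

In this toy example the single outlier falls in only one of the three blocks, so it can corrupt at most one block estimate and the median is unaffected, whereas the plain k-NN average is pulled far from zero. How the number of blocks should be chosen as a function of the target confidence level is not specified here.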