Expected values weighted by the inverse of a multivariate density or, equivalently, Lebesgue integrals of regression functions with multivariate regressors arise in various areas of application, including the estimation of average treatment effects, nonparametric estimation in random coefficient regression models, and deconvolution in Berkson errors-in-variables models. The frequently used nearest-neighbor and matching estimators suffer from bias problems in multiple dimensions. By using polynomial least squares fits on each cell of the $K^{\text{th}}$-order Voronoi tessellation for sufficiently large $K$, we develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate under mild smoothness assumptions on the unknown regression function and without any smoothness conditions on the unknown density of the covariates. We stress that, in contrast to competing methods for correcting the bias of matching estimators, our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size-dependent smoothing parameters. We complement the upper bounds with matching lower bounds derived from information-theoretic arguments, which show that some smoothness of the regression function is indeed required to achieve the parametric rate. Simulations illustrate the practical feasibility of the proposed methods.
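To fix ideas, the following is a minimal sketch (not the paper's bias-corrected procedure) of the baseline nearest-neighbor estimator of a Lebesgue integral $\int_{[0,1]^d} m(x)\,dx$ from data $(X_i, Y_i)$ with $Y_i = m(X_i) + \varepsilon_i$: each observation is weighted by the volume of its first-order Voronoi cell, which corresponds to a piecewise-constant fit and exhibits the dimension-dependent bias discussed above. The toy regression function `m`, the Beta-distributed covariates, and the Monte Carlo approximation of the cell volumes are illustrative assumptions; the proposed method instead fits local polynomials on cells of the $K^{\text{th}}$-order Voronoi tessellation.

```python
import numpy as np
from scipy.spatial import cKDTree

# Illustrative setup (assumed, not from the paper): estimate
# I = \int_{[0,1]^d} m(x) dx from (X_i, Y_i), Y_i = m(X_i) + eps_i,
# with covariates drawn from a non-uniform density.
rng = np.random.default_rng(0)
d, n = 2, 2000
m = lambda x: np.sin(2 * np.pi * x[:, 0]) + x[:, 1] ** 2   # toy regression function
X = rng.beta(2.0, 1.5, size=(n, d))                        # non-uniform covariate density
Y = m(X) + 0.1 * rng.standard_normal(n)

# Monte Carlo approximation of the (first-order) Voronoi cell volumes of the X_i in [0,1]^d.
U = rng.uniform(size=(200_000, d))
_, idx = cKDTree(X).query(U)                  # nearest sample point for each uniform draw
vol = np.bincount(idx, minlength=n) / len(U)  # estimated cell volumes |V_i|

# Baseline nearest-neighbor (piecewise-constant) estimator of the integral.
I_hat = np.sum(Y * vol)
print(I_hat)   # true value here is 1/3; the bias grows with the dimension d
```

The piecewise-constant fit behind `I_hat` is exactly what the abstract's modification replaces: fitting a polynomial by least squares to the $K$ nearest observations on each higher-order Voronoi cell removes the leading bias terms without introducing a smoothing parameter that depends on the sample size.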