This article studies derivatives in models that flexibly characterize the relationship between a response variable and multiple predictors, with the goal of providing both accurate estimation and inference procedures for hypothesis testing. In the setting of tensor product reproducing spaces for nonparametric multivariate functions, we propose a plug-in kernel ridge regression estimator for the derivatives of the underlying multivariate regression function under the smoothing spline ANOVA model. This estimator has an analytical form, making it simple to implement in practice. We first establish $L_\infty$ and $L_2$ convergence rates of the proposed estimator under general random designs. For derivatives of certain selected orders, we provide an in-depth analysis establishing a minimax lower bound that matches the $L_2$ convergence rate. Additionally, motivated by a wide range of applications, we propose a hypothesis testing procedure to examine whether a derivative is zero. Theoretical results demonstrate that the proposed testing procedure achieves the correct size under the null hypothesis and is asymptotically powerful under local alternatives. For ease of use, we also develop an associated bootstrap algorithm to construct the rejection region and calculate the p-value, and we establish the consistency of this algorithm. Simulation studies using synthetic data and an application to a real-world dataset confirm the effectiveness of our methods.
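To illustrate why a plug-in kernel ridge regression estimator of a derivative has an analytical form, here is a minimal sketch. It is not the estimator analyzed in the article: as a simplifying assumption it uses a plain Gaussian kernel instead of the tensor product kernel of a smoothing spline ANOVA model, and the function names, the bandwidth `sigma`, and the penalty `lam` are all hypothetical choices made for this example. The idea is that the fitted function is a linear combination of kernel sections, so its partial derivative is obtained by differentiating the kernel and reusing the same coefficients.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def fit_krr(X, y, lam, sigma):
    """Kernel ridge regression coefficients: solve (K + n * lam * I) alpha = y."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)


def predict(X_new, X, alpha, sigma):
    """Fitted regression function evaluated at the rows of X_new."""
    return gaussian_kernel(X_new, X, sigma) @ alpha


def predict_derivative(X_new, X, alpha, sigma, j):
    """Plug-in estimate of the first partial derivative in coordinate j.

    For the Gaussian kernel, d/dx_j k(x, x_i) = -(x_j - x_ij) / sigma^2 * k(x, x_i),
    so the derivative of the fit reuses the coefficients alpha.
    """
    K_new = gaussian_kernel(X_new, X, sigma)
    diff = X_new[:, None, j] - X[None, :, j]
    return (-(diff / sigma ** 2) * K_new) @ alpha


# Synthetic data with an additive signal (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.sin(2.0 * np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.standard_normal(200)

sigma, lam = 0.3, 1e-3
alpha = fit_krr(X, y, lam, sigma)

# Estimated partial derivative with respect to the first predictor.
X_new = rng.uniform(0.2, 0.8, size=(5, 2))
d_hat = predict_derivative(X_new, X, alpha, sigma, j=0)
```

Because the derivative estimate is a closed-form function of the fitted coefficients, no numerical differentiation is needed; a finite-difference comparison against `predict` is a convenient sanity check of the kernel derivative formula.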