We observe an unknown regression function of $d$ variables $f(\boldsymbol{t})$, $\boldsymbol{t} \in[0,1]^d$, in the Gaussian white noise model of intensity $\varepsilon>0$. We assume that the function $f$ is regular and that it is a sum of $k$-variate functions, where $k$ varies from $1$ to $s$ ($1\leq s\leq d$). These functions are unknown to us and only a few of them are nonzero. In this article, we address the problem of identifying the nonzero components of $f$ in the case when $d=d_\varepsilon\to \infty$ as $\varepsilon\to 0$ and $s$ is either fixed or $s=s_\varepsilon\to \infty$, $s=o(d)$ as $\varepsilon\to 0$. This may be viewed as a variable selection problem. We derive the conditions under which exact variable selection in the model at hand is possible and provide a selection procedure that achieves this type of selection. The procedure is adaptive to the degree of model sparsity described by the sparsity parameter $\beta\in(0,1)$. We also derive conditions that make exact variable selection impossible. Our results augment previous work in this area.