The central space of a joint distribution $(\vX,Y)$ is the minimal subspace $\mathcal S$ such that $Y\perp\hspace{-2mm}\perp \vX \mid P_{\mathcal S}\vX$, where $P_{\mathcal S}$ is the projection onto $\mathcal S$. Sliced inverse regression (SIR), one of the most popular methods for estimating the central space, often performs poorly when the structural dimension $d=\operatorname{dim}\left( \mathcal S \right)$ is large (e.g., $d \geqs 5$). In this paper, we demonstrate that the generalized signal-to-noise ratio (gSNR) tends to be extremely small for a general multiple-index model when $d$ is large. We then determine the minimax rate for estimating the central space over a large class of high-dimensional distributions with a large structural dimension $d$ (i.e., with no constant upper bound on $d$) in the low-gSNR regime. This result not only extends the existing minimax rate results for estimating the central space from distributions with fixed $d$ to those with large $d$, but also clarifies that the degradation in SIR performance is caused by the decay of signal strength. The technical tools developed here may be of independent interest for studying other central space estimation methods.
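For readers unfamiliar with SIR, the following is a minimal sketch of the classical textbook procedure (standardize the predictors, slice the sorted response, and eigen-decompose the weighted covariance of the slice means). It is not taken from the paper: the function name `sir_directions`, its parameters, and the equal-count slicing scheme are illustrative assumptions, and the sketch assumes $n > p$ with an invertible sample covariance, so it does not cover the high-dimensional regime the paper analyzes.

```python
import numpy as np


def sir_directions(X, y, d, n_slices=10):
    """Classical SIR sketch (hypothetical helper, not the paper's code).

    Returns a (p, d) matrix whose columns estimate a basis of the central
    space, plus the eigenvalues of the slice-mean matrix in descending order.
    """
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)

    # Inverse square root of the sample covariance.
    # Assumption: n > p and Sigma is positive definite.
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ Sigma_inv_sqrt  # standardized predictors

    # Slice the observations by sorted response, with roughly equal counts.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)               # slice mean of Z
        M += (len(idx) / n) * np.outer(m, m)  # weight by slice proportion

    # Leading d eigenvectors of M, mapped back to the original X scale.
    w, V = np.linalg.eigh(M)                  # eigh returns ascending order
    eta = V[:, ::-1][:, :d]
    return Sigma_inv_sqrt @ eta, w[::-1]
```

The second return value gives the eigenvalues of the sample slice-mean matrix. In this sketch they serve only as a rough indication of how much signal each estimated direction carries; their exact relation to the gSNR is defined in the paper, not here.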