A major family of sufficient dimension reduction (SDR) methods, called inverse regression, commonly requires the distribution of the predictor $X$ to have a linear $E(X|\beta^\mathsf{T}X)$ and a degenerate $\mathrm{var}(X|\beta^\mathsf{T}X)$ for the desired reduced predictor $\beta^\mathsf{T}X$. In this paper, we adjust the first- and second-order inverse regression methods by modeling $E(X|\beta^\mathsf{T}X)$ and $\mathrm{var}(X|\beta^\mathsf{T}X)$ under a mixture model assumption on $X$, which allows these terms to convey more complex patterns and is most suitable when $X$ has a clustered sample distribution. The proposed SDR methods build a natural path between inverse regression and the localized SDR methods, and in particular inherit the advantages of both: they are $\sqrt{n}$-consistent, efficiently implementable, directly adjustable to high-dimensional settings, and able to fully recover the desired reduced predictor. These findings are illustrated by simulation studies and a real data example, which also suggest that the proposed methods are effective for nonclustered data.
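
For orientation, the two conditions named above can be written out explicitly. The display below is a sketch under assumptions that are not taken from the abstract: $\mu$ and $\Sigma$ denote the mean and covariance matrix of $X$, and the mixture is taken to be Gaussian with a latent component label $W \in \{1,\dots,q\}$, component means $\mu_k$, and component covariances $\Sigma_k$. The formulation used in the paper itself may differ.

$$
E(X \mid \beta^\mathsf{T}X) = \mu + \Sigma\beta(\beta^\mathsf{T}\Sigma\beta)^{-1}\beta^\mathsf{T}(X-\mu),
\qquad
\mathrm{var}(X \mid \beta^\mathsf{T}X) = \Sigma - \Sigma\beta(\beta^\mathsf{T}\Sigma\beta)^{-1}\beta^\mathsf{T}\Sigma .
$$

Both identities hold when $X$ is multivariate normal. If instead $X \mid W = k \sim N(\mu_k, \Sigma_k)$, the first identity holds within each component, and the law of total expectation gives

$$
E(X \mid \beta^\mathsf{T}X) = \sum_{k=1}^{q} w_k(\beta^\mathsf{T}X)\,\bigl\{\mu_k + \Sigma_k\beta(\beta^\mathsf{T}\Sigma_k\beta)^{-1}\beta^\mathsf{T}(X-\mu_k)\bigr\},
\qquad
w_k(\beta^\mathsf{T}X) = P(W = k \mid \beta^\mathsf{T}X).
$$

Because the weights $w_k(\beta^\mathsf{T}X)$ vary with $\beta^\mathsf{T}X$, the conditional mean is in general no longer linear in $\beta^\mathsf{T}X$, which is one way such a model can convey more complex patterns for clustered predictors.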