A bottleneck of sufficient dimension reduction (SDR) in the modern era is that, among its numerous methods, only sliced inverse regression (SIR) is generally applicable in high-dimensional settings. The higher-order inverse regression methods, a major family of SDR methods that are superior to SIR at the population level, are hindered by the dimensionality of their intermediate matrix-valued parameters, which have an excessive number of columns. In this paper, we propose the generic idea of using a small subset of the columns of the matrix-valued parameter for SDR estimation, which breaks the convention of using the full (ambient) matrix in the higher-order inverse regression methods. With the aid of a fast column selection procedure, we then generalize these methods, as well as their ensembles, to sparse estimation in ultrahigh-dimensional settings, in a uniform manner that resembles sparse SIR and requires no additional assumptions. This is the first promising attempt in the literature to free the higher-order inverse regression methods from their dimensionality constraints, which broadens the applicability of SDR. The gain in SDR estimation efficiency from column selection is also studied in fixed-dimensional settings. Simulation studies and a real data example are provided at the end.
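To make the column-subset idea concrete, the following is a minimal sketch under stated assumptions, not the procedure proposed in the paper. It assumes sliced average variance estimation (SAVE) as a representative higher-order method, predictors that are already standardized, and a simple norm-based ranking as a stand-in for the column selection step. The function names `save_candidate_matrix` and `directions_from_columns` are hypothetical and introduced only for illustration.

```python
import numpy as np

def save_candidate_matrix(X, y, n_slices=5):
    """Stack I - Cov(X | slice h) side by side: a p x (p * n_slices) matrix.

    Assumes X is (approximately) standardized, so only centering is done here.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    order = np.argsort(y)  # slice the sample by the order of the response
    blocks = []
    for idx in np.array_split(order, n_slices):
        blocks.append(np.eye(p) - np.cov(Xc[idx], rowvar=False))
    return np.hstack(blocks)

def directions_from_columns(M, n_cols, d):
    """Keep the n_cols columns of M with the largest Euclidean norm (an
    illustrative selection rule), then return the d leading left singular
    vectors of the reduced matrix together with the kept column indices."""
    norms = np.linalg.norm(M, axis=0)
    keep = np.argsort(norms)[::-1][:n_cols]
    U, _, _ = np.linalg.svd(M[:, keep], full_matrices=False)
    return U[:, :d], keep

# Toy model: y depends on X only through its first coordinate, symmetrically,
# which is a case where first-order methods such as SIR carry no signal.
rng = np.random.default_rng(0)
n, p = 2000, 10
X = rng.standard_normal((n, p))
y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(n)

M = save_candidate_matrix(X, y)                      # p x (p * n_slices) columns
B, keep = directions_from_columns(M, n_cols=3, d=1)  # estimate from 3 columns only
```

In this toy setting, the estimated direction `B` is computed from 3 of the 50 columns of `M`; comparing it with the first coordinate axis gives a quick check of how much information the discarded columns carried.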