A bottleneck of sufficient dimension reduction (SDR) in the modern era is that, among numerous methods, only sliced inverse regression (SIR) is generally applicable in high-dimensional settings. The higher-order inverse regression methods, a major family of SDR methods that are superior to SIR at the population level, suffer from the dimensionality of their intermediate matrix-valued parameters, which have an excessive number of columns. In this paper, we propose the generic idea of using a small subset of the columns of the matrix-valued parameter for SDR estimation, which breaks the convention of using the ambient matrix in higher-order inverse regression methods. With the aid of a quick column selection procedure, we then generalize these methods, as well as their ensembles, to sparse estimation in ultrahigh-dimensional settings, in a uniform manner that resembles sparse SIR and requires no additional assumptions. This is the first promising attempt in the literature to free the higher-order inverse regression methods from their dimensionality, which broadens the applicability of SDR. The efficiency gain from column selection in SDR estimation is also studied in fixed-dimensional settings. Simulation studies and a real data example are provided at the end.
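To make the column-subset idea concrete, the following is a minimal sketch and not the procedure of the paper. It assumes sliced average variance estimation (SAVE) as the higher-order method, takes the matrix-valued parameter to be the p x pH matrix obtained by stacking the slice matrices I - Cov(Z | slice) side by side, and uses the largest column norms as a stand-in selection rule. The function names `save_candidate_matrix` and `column_subset_directions`, the toy model, and all tuning values are hypothetical choices made for illustration.

```python
import numpy as np


def save_candidate_matrix(X, y, n_slices=5):
    """Stack the SAVE slice matrices I - Cov(Z | slice) into a p x (p * H) matrix.

    Returns the stacked matrix and the inverse square root of the sample
    covariance, which maps directions back to the original X scale.
    """
    p = X.shape[1]
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt  # standardized predictors
    order = np.argsort(y)  # slice the sample by the order of the response
    blocks = [
        np.eye(p) - np.cov(Z[idx], rowvar=False)
        for idx in np.array_split(order, n_slices)
    ]
    return np.hstack(blocks), inv_sqrt


def column_subset_directions(M, n_cols, d):
    """Keep the n_cols columns of M with the largest norm (illustrative rule)
    and return the top-d left singular vectors of the resulting submatrix."""
    norms = np.linalg.norm(M, axis=0)
    keep = np.argsort(norms)[::-1][:n_cols]
    U, _, _ = np.linalg.svd(M[:, keep], full_matrices=False)
    return U[:, :d], keep


# Toy model (assumed): y depends on X only through x_1, symmetrically,
# so a first-order method such as SIR would miss the direction.
rng = np.random.default_rng(0)
n, p = 2000, 6
X = rng.standard_normal((n, p))
y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(n)

M, inv_sqrt = save_candidate_matrix(X, y, n_slices=5)
U, keep = column_subset_directions(M, n_cols=3, d=1)

# Map the estimated direction back to the X scale and normalize it.
beta = inv_sqrt @ U[:, 0]
beta = beta / np.linalg.norm(beta)
print("selected columns:", np.sort(keep))
print("estimated direction:", np.round(beta, 3))
```

In this toy setting only 3 of the 30 columns enter the singular value decomposition, and the direction is recovered from that submatrix alone. The norm-based rule is used here purely because it is simple to state; the selection procedure studied in the paper may differ.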