Sliced inverse regression (SIR, Li 1991) is a pioneering work and the most recognized method in sufficient dimension reduction. Although promising progress has been made in the theory and methods of high-dimensional SIR, two challenges remain for high-dimensional multivariate applications. First, choosing the number of slices in SIR is difficult: the choice depends on the sample size, the distribution of the variables, and other practical considerations. Second, extending SIR from a univariate to a multivariate response is not trivial. Targeting the same dimension reduction subspace as SIR, we propose a new slicing-free method that provides a unified solution to sufficient dimension reduction with high-dimensional covariates and a univariate or multivariate response. We achieve this by adopting the recently developed martingale difference divergence matrix (MDDM, Lee & Shao 2018) and penalized eigen-decomposition algorithms. To establish the consistency of our method with a high-dimensional predictor and a multivariate response, we develop a new concentration inequality for the sample MDDM around its population counterpart using the theory of U-statistics, which may be of independent interest. Simulations and real data analysis demonstrate the favorable finite-sample performance of the proposed method.
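To illustrate the slicing-free idea, the following is a minimal sketch of a sample MDDM of the predictors given the response, followed by an ordinary (unpenalized) eigen-decomposition. It is not the penalized algorithm proposed here. The function names `sample_mddm` and `leading_directions` are hypothetical, and the sketch assumes the V-statistic form of the sample MDDM, Euclidean distances between responses, and predictors with identity covariance, so that no covariance adjustment is applied.

```python
import numpy as np


def sample_mddm(X, Y):
    """Sample MDDM of X given Y (assumed V-statistic form).

    Computes -(1/n^2) * sum_{j,k} (X_j - Xbar)(X_k - Xbar)^T * ||Y_j - Y_k||.
    X has shape (n, p); Y has shape (n,) or (n, q).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]  # treat a univariate response as an (n, 1) array
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center the predictors
    # Pairwise Euclidean distances between responses; no slicing is needed,
    # and the same formula covers univariate and multivariate responses.
    diff = Y[:, None, :] - Y[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    return -(Xc.T @ D @ Xc) / n ** 2


def leading_directions(M, d):
    """Top-d eigenvectors of a symmetric matrix M (plain, unpenalized)."""
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1][:d]  # indices of the d largest eigenvalues
    return vecs[:, order]
```

In a high-dimensional setting the plain eigen-decomposition above would be replaced by a penalized one, and predictors with a non-identity covariance would call for a generalized eigen-decomposition; both are omitted from this sketch.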