Partial Least Squares (PLS) regression emerged as an alternative to ordinary least squares for addressing multicollinearity in a wide range of scientific applications. As multidimensional tensor data become more widespread, tensor adaptations of PLS have been developed. Our investigations reveal that the previously established asymptotic result for the PLS estimator with a tensor response breaks down as the tensor dimensions and the number of features increase relative to the sample size. To address this, we propose Sparse Higher Order Partial Least Squares (SHOPS) regression and an accompanying algorithm. SHOPS simultaneously accommodates variable selection, dimension reduction, and tensor association denoising. We establish the asymptotic accuracy of the SHOPS algorithm under a high-dimensional regime and verify these results through comprehensive simulation experiments and through two contemporary high-dimensional biological data analyses.