Partial Least Squares (PLS) refers to a class of dimension-reduction techniques that identify two sets of components with maximal covariance, in order to model the relationship between two sets of observed variables $x\in\mathbb{R}^p$ and $y\in\mathbb{R}^q$, with $p\geq 1, q\geq 1$. Probabilistic formulations have recently been proposed for several versions of PLS. Focusing first on the probabilistic formulation of PLS-SVD proposed by el Bouhaddani et al., we establish that the constraints on their model parameters are too restrictive: they define particular distributions for $(x,y)$ under which components with maximal covariance (solutions of PLS-SVD) necessarily also have maximal variance (that is, they are solutions of the principal component analyses of $x$ and $y$, respectively). We propose an alternative probabilistic formulation of PLS-SVD that is no longer restricted to these particular distributions. We then present numerical illustrations of the limitation of the original model of el Bouhaddani et al. We also briefly discuss similar limitations in another latent variable model for dimension reduction.
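To make the distinction between maximal covariance and maximal variance concrete, the following is a minimal NumPy sketch. It is not the numerical illustration from the paper. The simulated data, the helper names (`pls_svd_weights`, `pca_direction`, `score_cov`), and all noise scales are assumptions chosen for illustration only. The sketch uses the standard fact that the leading PLS-SVD weight pair is given by the top singular vectors of the sample cross-covariance matrix of $x$ and $y$, while the first principal direction is the top eigenvector of each covariance matrix.

```python
import numpy as np


def pls_svd_weights(X, Y):
    """Leading PLS-SVD weight pair: top singular vectors of the sample cross-covariance."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxy = Xc.T @ Yc / (X.shape[0] - 1)
    U, s, Vt = np.linalg.svd(Sxy, full_matrices=False)
    return U[:, 0], Vt[0], s[0]


def pca_direction(X):
    """First principal direction: top eigenvector of the sample covariance."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / (X.shape[0] - 1)
    _, vecs = np.linalg.eigh(S)  # eigenvalues are returned in ascending order
    return vecs[:, -1]


def score_cov(X, Y, w, v):
    """Sample covariance between the scores X w and Y v."""
    t = (X - X.mean(axis=0)) @ w
    u = (Y - Y.mean(axis=0)) @ v
    return float(t @ u / (X.shape[0] - 1))


# Hypothetical data: a shared latent variable z drives one coordinate of x and
# one coordinate of y, while other coordinates carry larger, unrelated variance.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
X = np.column_stack([
    3.0 * rng.normal(size=n),      # high variance, unrelated to y
    z + 0.5 * rng.normal(size=n),  # carries the shared signal
    rng.normal(size=n),
])
Y = np.column_stack([
    z + 0.5 * rng.normal(size=n),  # carries the shared signal
    2.0 * rng.normal(size=n),      # high variance, unrelated to x
])

w, v, s1 = pls_svd_weights(X, Y)
w_pca, v_pca = pca_direction(X), pca_direction(Y)

pls_cov = score_cov(X, Y, w, v)                # equals the top singular value s1
pca_cov = abs(score_cov(X, Y, w_pca, v_pca))   # covariance reached by the PCA directions
alignment_x = abs(w @ w_pca)                   # |cosine| between PLS and PCA directions in x

print(f"covariance of PLS-SVD scores: {pls_cov:.3f}")
print(f"covariance of PCA scores:     {pca_cov:.3f}")
print(f"|cos| between PLS and PCA directions for x: {alignment_x:.3f}")
```

In this assumed setup, the directions of maximal variance are dominated by the unrelated high-variance coordinates, so the PLS-SVD and PCA directions differ. A model under which the two necessarily coincide could not represent data of this kind.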