On some limitations of probabilistic models for dimension-reduction: Illustration in the case of probabilistic formulations of partial least squares

Partial Least Squares (PLS) refers to a class of dimension-reduction techniques that aim to identify two sets of components with maximal covariance, in order to model the relationship between two sets of observed variables $x\in\mathbb{R}^p$ and $y\in\mathbb{R}^q$, with $p\geq 1, q\geq 1$. Probabilistic formulations have recently been proposed for several versions of PLS. Focusing first on the probabilistic formulation of PLS-SVD proposed by el Bouhaddani et al., we establish that the constraints on their model parameters are too restrictive: they define particular distributions for $(x,y)$ under which components with maximal covariance (solutions of PLS-SVD) are also necessarily of respective maximal variances (solutions of principal component analyses of $x$ and $y$, respectively). We propose an alternative probabilistic formulation of PLS-SVD that is no longer restricted to these particular distributions. We then present numerical illustrations of the limitation of the original model of el Bouhaddani et al. We also briefly discuss similar limitations in another latent variable model for dimension-reduction.
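To make the distinction between maximal covariance and maximal variance concrete, the following is a minimal sketch on synthetic data. It is not the numerical illustration from the paper: the data-generating process, the sample size, and the helper names `first_pls_svd_weights` and `first_pca_direction` are assumptions chosen for illustration. It takes the first PLS-SVD weight vectors to be the leading singular vectors of the sample cross-covariance matrix of $x$ and $y$, and the first principal direction to be the leading eigenvector of the sample covariance matrix of $x$.

```python
import numpy as np


def first_pls_svd_weights(X, Y):
    """Leading left/right singular vectors of the sample cross-covariance of X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    cross_cov = Xc.T @ Yc / (len(X) - 1)
    U, _, Vt = np.linalg.svd(cross_cov, full_matrices=False)
    return U[:, 0], Vt[0]


def first_pca_direction(X):
    """Leading eigenvector of the sample covariance matrix of X."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues are returned in ascending order
    return eigvecs[:, -1]


# Hypothetical toy data: the first coordinate of x and of y shares a latent
# signal t (low variance, high covariance); the second coordinate of each is
# independent noise with a much larger variance.
rng = np.random.default_rng(0)
n = 20000
t = rng.normal(size=n)
X = np.column_stack([t + 0.1 * rng.normal(size=n), 3.0 * rng.normal(size=n)])
Y = np.column_stack([t + 0.1 * rng.normal(size=n), 3.0 * rng.normal(size=n)])

w, c = first_pls_svd_weights(X, Y)
v = first_pca_direction(X)

print("first PLS-SVD weight for x:", np.round(w, 3))
print("first PCA direction of x:  ", np.round(v, 3))
```

In this toy setting the PLS-SVD weight for $x$ aligns with the first coordinate, while the first principal direction aligns with the second, so the two solutions differ. A distribution of this kind would fall outside any model under which maximal-covariance components must also be maximal-variance components.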

