Principal component analysis (PCA), along with its extensions to manifolds and outlier-contaminated data, has been indispensable in computer vision and machine learning. In this work, we present a unifying formalism for PCA and its variants, and introduce a framework based on flags of linear subspaces, i.e., hierarchies of nested linear subspaces of increasing dimension, which not only allows for a common implementation but also yields novel variants not explored previously. We begin by generalizing traditional PCA methods that either maximize variance or minimize reconstruction error. We expand these interpretations to develop a wide array of new dimensionality reduction algorithms by accounting for outliers and the data manifold. To devise a common computational approach, we recast robust and dual forms of PCA as optimization problems on flag manifolds. We then integrate tangent space approximations of principal geodesic analysis (tangent-PCA) into this flag-based framework, creating novel robust and dual geodesic PCA variations. The remarkable flexibility offered by the 'flagification' introduced here enables even more algorithmic variants identified by specific flag types. Last but not least, we propose an effective convergent solver for these flag formulations employing the Stiefel manifold. Our empirical results on both real-world and synthetic scenarios demonstrate the superiority of our novel algorithms, especially in terms of robustness to outliers on manifolds.
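To make the notion of a flag concrete, the following is a minimal sketch of ordinary PCA only, not of the robust, dual, or geodesic flag variants or the Stiefel-manifold solver proposed in this work. It assumes NumPy, and the helper names (`pca_flag`, `explained_variance`, `reconstruction_error`) and the synthetic data are hypothetical choices made for illustration. It shows that the leading principal directions span nested subspaces of increasing dimension, and that for each subspace the captured variance and the reconstruction error sum to the total variance, which is why the two classical formulations of PCA agree.

```python
import numpy as np

def pca_flag(X, dims):
    """Orthonormal bases of nested subspaces spanned by the leading principal directions.

    X    : (n_samples, n_features) data matrix
    dims : increasing subspace dimensions, e.g. (1, 2, 3)
    """
    Xc = X - X.mean(axis=0)                      # center the data
    # Rows of Vt are principal directions, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    bases = [Vt[:k].T for k in dims]             # each basis has shape (n_features, k)
    return bases, Xc

def explained_variance(Xc, U):
    """Sum of squared norms of the projections onto span(U)."""
    return np.sum((Xc @ U) ** 2)

def reconstruction_error(Xc, U):
    """Sum of squared residuals after projecting onto span(U)."""
    return np.sum((Xc - Xc @ U @ U.T) ** 2)

# Synthetic data with anisotropic spread (an assumption made for this illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

bases, Xc = pca_flag(X, dims=(1, 2, 3))
total = np.sum(Xc ** 2)

for U in bases:
    # Variance captured plus reconstruction error equals the total variance.
    assert np.isclose(explained_variance(Xc, U) + reconstruction_error(Xc, U), total)

for U_small, U_big in zip(bases, bases[1:]):
    # Nestedness: the smaller subspace is unchanged by projection onto the larger one.
    assert np.allclose(U_big @ (U_big.T @ U_small), U_small)
```

In this sketch the list `bases` represents a flag with dimensions (1, 2, 3) inside a five-dimensional space; a different choice of `dims` would correspond to a different flag type.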