Analyzing data whose variables represent directions or angles, as arise in bioinformatics, biology, and geology, poses unique challenges because such data lie in non-Euclidean spaces. Data of this type are called circular in the univariate case and spherical or toroidal in the multivariate case. In this paper, we introduce a novel extension of Probabilistic Principal Component Analysis (PPCA) designed for toroidal (torus) data, termed Torus Probabilistic PCA (TPPCA). We provide detailed algorithms for implementing TPPCA and demonstrate its applicability to torus data. To assess the efficacy of TPPCA, we perform comparative analyses using a simulation study and three real datasets. Our findings highlight both the advantages and the limitations of TPPCA in handling torus data. Furthermore, we propose statistical tests based on likelihood ratio statistics to determine the number of components, which enhances the practical utility of TPPCA in real-world applications.
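To illustrate why angular variables call for dedicated methods, the following minimal sketch contrasts the ordinary arithmetic mean with the standard circular mean for two directions close to 0°. It is not part of TPPCA; the function name `circular_mean` and the example angles are assumptions made purely for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def circular_mean(angles):
    """Mean direction of angles given in radians, returned in [-pi, pi].

    Hypothetical helper for illustration: averages the unit vectors
    (cos, sin) of the angles and returns the angle of the resultant.
    """
    angles = np.asarray(angles, dtype=float)
    return np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())


# Two directions that are only 20 degrees apart, on either side of 0 degrees.
angles = np.deg2rad([350.0, 10.0])

# The arithmetic mean ignores periodicity and points the opposite way.
naive_mean_deg = np.rad2deg(angles.mean())

# The circular mean respects periodicity and lands at (about) 0 degrees.
circular_mean_deg = np.rad2deg(circular_mean(angles))

print(f"arithmetic mean: {naive_mean_deg:.1f} degrees")
print(f"circular mean:   {circular_mean_deg:.1f} degrees")
```

The arithmetic mean treats 350° and 10° as far apart and returns 180°, whereas the circular mean returns a direction at approximately 0°. The same periodicity affects every coordinate of multivariate angular data on the torus.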