Many real-world datasets can be regarded as functional, in the sense that the processes generating them are continuous. A fundamental property of such data is that, in theory, they belong to an infinite-dimensional space. Although in practice only finitely many observations are available, the data remain high-dimensional, so dimensionality reduction methods are crucial. In this vein, the main state-of-the-art method for functional data analysis is Functional PCA. However, this classic technique assumes that the data lie on a linear manifold, and it may perform poorly when this hypothesis does not hold. This research focuses on a non-linear manifold learning method: Diffusion Maps. The article explains how to extend this multivariate method to functional data and compares its behavior with that of Functional PCA on several simulated and real examples.