We develop scalable manifold learning methods and theory, motivated by the problem of estimating the manifold of fMRI activation in the Human Connectome Project (HCP). We propose Fast Graph Laplacian Estimation for Heat Kernel Gaussian Processes (FLGP) under the natural exponential family model. FLGP handles large sample sizes $ n $, preserves the intrinsic geometry of the data, and reduces the computational complexity from $ \mathcal{O}(n^3) $ to $ \mathcal{O}(n) $ through two components: a novel reduced-rank approximation of the graph Laplacian's transition matrix, and a truncated Singular Value Decomposition for computing its eigenpairs. Our numerical experiments demonstrate FLGP's scalability and improved accuracy for manifold learning from large-scale complex data.
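To make the two computational ingredients concrete, the following is a minimal sketch and not the authors' implementation. It assumes a generic landmark-based construction: a random subset of `m` data points serves as landmarks, a Gaussian kernel defines the point-to-landmark transitions `Z`, and the symmetric transition matrix is approximated by $ A = Z \Lambda^{-1} Z^\top $ with $ \Lambda = \mathrm{diag}(Z^\top \mathbf{1}) $. The function names, the landmark selection rule, the kernel, and the parameters `m`, `k`, `bandwidth`, and `t` are all illustrative assumptions. Only the thin factor $ Z \Lambda^{-1/2} $ is decomposed, so the cost grows linearly in $ n $ for fixed `m`.

```python
import numpy as np


def reduced_rank_eigenpairs(X, m=50, k=10, bandwidth=1.0, seed=0):
    """Top-k eigenpairs of a rank-m approximation A = Z Lambda^{-1} Z^T.

    Hypothetical helper: the landmark rule and Gaussian kernel are assumptions.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: landmarks are a uniform random subsample of the data.
    landmarks = X[rng.choice(n, size=m, replace=False)]

    # Squared Euclidean distances between all points and landmarks (n x m).
    sq = (
        (X ** 2).sum(axis=1)[:, None]
        - 2.0 * X @ landmarks.T
        + (landmarks ** 2).sum(axis=1)[None, :]
    )
    sq = np.maximum(sq, 0.0)  # guard against tiny negative round-off

    # Row-stochastic point-to-landmark transition matrix Z.
    K = np.exp(-sq / (2.0 * bandwidth ** 2))
    Z = K / K.sum(axis=1, keepdims=True)

    # B = Z Lambda^{-1/2}, so that B B^T = Z Lambda^{-1} Z^T = A.
    B = Z / np.sqrt(Z.sum(axis=0))

    # Thin SVD of the n x m factor: left singular vectors of B are
    # eigenvectors of A, and squared singular values are its eigenvalues.
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, :k], s[:k] ** 2


def heat_kernel(U, mu, t):
    """Truncated heat kernel exp(-t L) with L = I - A, from eigenpairs of A.

    Forms a dense n x n matrix, so it is meant for small illustrative n only.
    """
    lam = 1.0 - mu  # eigenvalues of the graph Laplacian I - A
    return (U * np.exp(-t * lam)) @ U.T


if __name__ == "__main__":
    # Toy data: noisy-free points on the unit circle, a 1-D manifold in R^2.
    rng = np.random.default_rng(1)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
    X = np.column_stack([np.cos(theta), np.sin(theta)])
    U, mu = reduced_rank_eigenpairs(X, m=40, k=6, bandwidth=0.5)
    H = heat_kernel(U, mu, t=1.0)
    print(mu)
    print(H.shape)
```

In this sketch the approximate transition matrix is symmetric and its rows sum to one, so its leading eigenvalue is 1 with a constant eigenvector. The resulting truncated heat kernel could then serve as a covariance matrix for a Gaussian process prior, which is the role the abstract assigns to the heat kernel.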