Heterogeneous functional data are common in time series and longitudinal data analysis. To capture the statistical structures of such data, we propose Functional Singular Value Decomposition (FSVD), a unified framework with structure-adaptive interpretability for analyzing heterogeneous functional data. We establish the mathematical foundation of FSVD by proving its existence and establishing its fundamental properties using operator theory. We then develop an implementation for noisy and irregularly observed functional data based on a novel joint kernel ridge regression scheme, and provide theoretical guarantees for its convergence and estimation accuracy. FSVD also introduces the concepts of intrinsic basis functions and intrinsic basis vectors, which represent two fundamental statistical structures of random functions and connect FSVD to tasks including functional principal component analysis, factor models, functional clustering, and functional completion. We compare FSVD with existing methods on several tasks through extensive simulation studies. To demonstrate the value of FSVD on real-world datasets, we apply it to extract temporal patterns from a COVID-19 case count dataset and to perform data completion on an electronic health record dataset.
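To make the two structures named in the abstract concrete, the following is a minimal sketch of a singular value decomposition applied to discretized curves. It is not the joint kernel ridge regression estimator described above: it assumes fully observed, noise-free curves on a common uniform grid, and the function name `discretized_fsvd` and its arguments are hypothetical, introduced only for illustration. Under those assumptions, the left singular vectors play the role of intrinsic basis vectors (one entry per subject) and the rescaled right singular vectors play the role of intrinsic basis functions (one value per time point).

```python
import numpy as np


def discretized_fsvd(X, grid, rank):
    """Rank-`rank` SVD of curves sampled on a common uniform grid.

    X    : (n, T) array, row i holds curve i evaluated at `grid`.
    grid : (T,) array of equally spaced time points (an assumption).
    Returns
      rho : (rank,) singular values,
      a   : (n, rank) unit-norm vectors across subjects,
      phi : (rank, T) functions with approximately unit L2 norm on the grid.
    """
    dt = grid[1] - grid[0]  # uniform spacing assumed
    # Scaling by sqrt(dt) makes the Euclidean inner product of rows
    # a Riemann-sum approximation of the L2 inner product of functions.
    U, s, Vt = np.linalg.svd(X * np.sqrt(dt), full_matrices=False)
    rho = s[:rank]
    a = U[:, :rank]
    phi = Vt[:rank] / np.sqrt(dt)
    return rho, a, phi


# Toy data: 30 curves built from two orthonormal functions with
# subject-specific scores. No centering is applied to the curves.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200, endpoint=False)
basis = np.vstack([
    np.sqrt(2.0) * np.sin(2.0 * np.pi * grid),
    np.sqrt(2.0) * np.cos(2.0 * np.pi * grid),
])
scores = rng.normal(size=(30, 2)) * np.array([3.0, 1.0])
X = scores @ basis

rho, a, phi = discretized_fsvd(X, grid, rank=2)
X_hat = (a * rho) @ phi  # sum over r of rho_r * a_r * phi_r(t)
```

Because the toy curves have exact rank two, `X_hat` reproduces `X` up to floating-point error. With noisy or irregularly sampled observations this direct matrix SVD is no longer applicable, which is the setting the abstract's kernel-based implementation addresses.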