This paper studies robust functional data analysis for discretely observed data whose underlying process may be heavy-tailed, skewed, or contaminated. We propose a unified robust framework for the functional mean, covariance, and principal component analysis, whereas existing methods and definitions often differ from one another or address only fully observed functions (the ``ideal'' case). Specifically, the robust functional mean can deviate from its non-robust counterpart and is estimated using robust local linear regression. Moreover, we define a new robust functional covariance that shares useful properties with the classical version. Importantly, this covariance yields a robust version of the Karhunen--Lo\`eve decomposition and corresponding principal components that are useful for dimension reduction. Theoretical results for the robust functional mean, covariance, and eigenfunction estimates, based on pooling discretely observed data (ranging from sparse to dense), are established and align with those of their non-robust counterparts. The newly proposed perturbation bounds for the estimated eigenfunctions, with indices allowed to grow with the sample size, lay the foundation for further modeling based on robust functional principal component analysis.
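To make the phrase "robust local linear regression on pooled data" concrete, the following is a minimal sketch and not the paper's estimator. The abstract does not specify the loss, kernel, or scale estimate, so the Huber loss, the Epanechnikov kernel, the fixed MAD scale, the tuning constant `c = 1.345`, and all function names are assumptions made purely for illustration. The fit is computed by iteratively reweighted least squares, and the simulated contaminated-normal data are likewise hypothetical.

```python
import math
import random
from statistics import median


def epanechnikov(u):
    """Epanechnikov kernel, supported on (-1, 1)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def huber_weight(r, c):
    """IRLS weight psi(r) / r for the Huber loss with threshold c."""
    a = abs(r)
    return 1.0 if a <= c else c / a


def robust_local_linear(t0, times, values, h, c=1.345, n_iter=50, tol=1e-10):
    """Huber-type local linear estimate of the mean at t0 from pooled pairs."""
    # Keep only observations that receive positive kernel weight.
    pts = [(t - t0, y, epanechnikov((t - t0) / h)) for t, y in zip(times, values)]
    pts = [p for p in pts if p[2] > 0.0]
    if len(pts) < 2:
        raise ValueError("too few observations within bandwidth h of t0")

    # Initial fit: local median as intercept, zero slope.
    b0 = median(y for _, y, _ in pts)
    b1 = 0.0
    # Residual scale from the MAD around the initial fit, held fixed afterwards.
    scale = 1.4826 * median(abs(y - b0) for _, y, _ in pts)
    if scale <= 0.0:
        scale = 1.0  # assumption: arbitrary fallback when the MAD degenerates

    for _ in range(n_iter):
        # Accumulate the weighted normal equations for (intercept, slope).
        s0 = s1 = s2 = r0 = r1 = 0.0
        for d, y, k in pts:
            w = k * huber_weight((y - b0 - b1 * d) / scale, c)
            s0 += w
            s1 += w * d
            s2 += w * d * d
            r0 += w * y
            r1 += w * d * y
        det = s0 * s2 - s1 * s1
        if det <= 0.0:
            break  # degenerate design, e.g. all points at the same time
        new_b0 = (s2 * r0 - s1 * r1) / det
        new_b1 = (s0 * r1 - s1 * r0) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0  # the intercept is the estimate at t0


# Hypothetical sparse design: 200 curves, 5 noisy observations each,
# with roughly 10% of the observations drawn from a much wider normal.
rng = random.Random(0)
times, values = [], []
for _ in range(200):
    for _ in range(5):
        t = rng.random()
        sd = 10.0 if rng.random() < 0.1 else 0.3
        times.append(t)
        values.append(math.sin(2 * math.pi * t) + rng.gauss(0.0, sd))

estimate = robust_local_linear(0.25, times, values, h=0.1)
print(f"robust local linear estimate at t = 0.25: {estimate:.3f}")
```

Pooling here simply means that all (time, value) pairs are stacked across curves before smoothing, which is why the same routine applies to sparse and dense designs alike. The bandwidth `h = 0.1` is chosen by hand; a data-driven choice would be needed in practice.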