Functional principal component analysis (FPCA) is a key tool in the study of functional data, driving both exploratory analyses and feature construction for use in formal modeling and testing procedures. However, existing methods for FPCA do not apply when functional observations are truncated, e.g., when the measurement instrument supports recordings only within a pre-specified interval, so that values outside the range are recorded at the nearest boundary. A naive application of existing methods without correction for truncation induces bias. We extend the FPCA framework to accommodate truncated noisy functional data by first recovering smooth mean and covariance surface estimates that are representative of the latent process's mean and covariance functions. Unlike traditional sample covariance smoothing techniques, our procedure yields a positive semi-definite covariance surface, computed without the need to retroactively remove negative eigenvalues in the covariance operator decomposition. Additionally, we construct an FPC score predictor and demonstrate its use in the generalized functional linear model. Convergence rates for the proposed estimators are provided. In simulation experiments, the proposed method yields better predictive performance and lower bias than existing alternatives. We illustrate its practical value through an application to a study with truncated blood glucose measurements.
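To make the bias claim concrete, the following is a minimal simulation sketch, not the estimator proposed here. It assumes a hypothetical latent process with a sinusoidal mean and two orthonormal components, Gaussian scores and noise, and an instrument range of [-1.5, 1.5]; all of these settings are illustrative choices and are not taken from the paper. It compares the naive pointwise sample mean and the leading eigenvalue of the naive sample covariance computed from untruncated curves against the same quantities computed from curves clipped to the instrument range.

```python
import numpy as np

rng = np.random.default_rng(0)

n_curves, n_grid = 2000, 50
t = np.linspace(0.0, 1.0, n_grid)

# Hypothetical latent process (illustrative assumption): a mean function
# plus two orthonormal components with score variances 1.0 and 0.25.
mu = 2.0 * np.sin(2.0 * np.pi * t)
phi1 = np.sqrt(2.0) * np.sin(2.0 * np.pi * t)
phi2 = np.sqrt(2.0) * np.cos(2.0 * np.pi * t)
scores = rng.normal(size=(n_curves, 2)) * np.sqrt([1.0, 0.25])
latent = mu + scores[:, [0]] * phi1 + scores[:, [1]] * phi2

# Add measurement noise, then clip to the assumed instrument range so that
# out-of-range values are recorded at the nearest boundary.
noisy = latent + rng.normal(scale=0.2, size=latent.shape)
lower, upper = -1.5, 1.5
observed = np.clip(noisy, lower, upper)


def leading_eigenvalue(y, grid):
    """Leading eigenvalue of the naive sample covariance, scaled by the
    grid spacing to approximate the covariance operator's eigenvalue."""
    cov = np.cov(y, rowvar=False)
    step = grid[1] - grid[0]
    return np.linalg.eigvalsh(cov)[-1] * step


# Largest pointwise error of the naive sample mean, without and with clipping.
mean_bias_full = np.max(np.abs(noisy.mean(axis=0) - mu))
mean_bias_clipped = np.max(np.abs(observed.mean(axis=0) - mu))

eig_full = leading_eigenvalue(noisy, t)
eig_clipped = leading_eigenvalue(observed, t)

print("max |mean error|, untruncated:", mean_bias_full)
print("max |mean error|, truncated:  ", mean_bias_clipped)
print("leading eigenvalue, untruncated:", eig_full)
print("leading eigenvalue, truncated:  ", eig_clipped)
```

Under these assumed settings, clipping pulls the naive mean toward the interior of the instrument range and shrinks the estimated variance along the leading component, which is the kind of bias a truncation-aware procedure is meant to correct.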