This paper proposes distributed estimation procedures for three scalar-on-function regression models: the functional linear model (FLM), the functional non-parametric model (FNPM), and the functional partial linear model (FPLM). The framework addresses two key challenges in functional data analysis: the high computational cost of large samples and restrictions on sharing raw data across institutions. Monte Carlo simulations show that the distributed estimators substantially reduce computation time while preserving high estimation and prediction accuracy for all three models. When block sizes become too small, however, the FPLM overfits, which leads to narrower prediction intervals and lower empirical coverage probability. An empirical study using the \textit{tecator} dataset further supports these findings.