In the era of big data, leveraging information from multiple clients while preserving data privacy has emerged as a critical challenge in modern statistical modeling and forecasting. This paper introduces a privacy-preserving federated learning framework for high-dimensional vector autoregressive models, where each client's dynamics are characterized by a common low-rank structure augmented with sparse client-specific deviations. We develop a two-stage estimation procedure that integrates differentially private representation learning for the shared component with local personalization for client-specific adjustments, enabling effective information pooling under selective privacy constraints. We establish non-asymptotic error bounds for both the single-client and federated estimators to characterize the inherent privacy-utility trade-off, and we prove consistency of a ridge-type rank selection criterion. Simulation studies demonstrate that federation substantially improves estimation accuracy when local sample sizes are limited. Two empirical applications, one analyzing electricity-economy linkages across U.S. states and one conducting multi-task macroeconomic forecasting across countries, highlight the superior predictive accuracy of the proposed method over existing single-client benchmarks.
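To make the model structure concrete, the following is a minimal sketch of a "common low-rank plus sparse deviation" setup for a first-order VAR. It is illustrative only: the lag order, the dimensions, the scaling constants, and the function names `make_client_transitions` and `simulate_var1` are assumptions and do not come from the paper. It covers only data generation, not the differentially private estimation procedure.

```python
import numpy as np


def make_client_transitions(d=10, rank=2, n_clients=3, n_sparse=5, seed=0):
    """Build one shared low-rank matrix L and per-client matrices A_k = L + S_k.

    Hypothetical illustration: S_k has at most `n_sparse` nonzero entries.
    """
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((d, rank))
    V = rng.standard_normal((d, rank))
    L = U @ V.T
    # Rescale so the spectral norm of L is 0.5, which keeps each VAR stable
    # once the small sparse deviations are added.
    L *= 0.5 / np.linalg.norm(L, 2)

    mats = []
    for _ in range(n_clients):
        S = np.zeros((d, d))
        idx = rng.choice(d * d, size=n_sparse, replace=False)
        S.flat[idx] = rng.uniform(-0.05, 0.05, size=n_sparse)
        mats.append(L + S)
    return L, mats


def simulate_var1(A, T, seed=0):
    """Simulate y_t = A y_{t-1} + e_t with standard normal noise, y_0 = 0."""
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    Y = np.zeros((T, d))
    y = np.zeros(d)
    for t in range(T):
        y = A @ y + rng.standard_normal(d)
        Y[t] = y
    return Y


if __name__ == "__main__":
    L, mats = make_client_transitions()
    series = [simulate_var1(A, T=50, seed=k) for k, A in enumerate(mats)]
    print("rank of shared component:", np.linalg.matrix_rank(L))
    print("series shapes:", [Y.shape for Y in series])
```

In this sketch every client shares the same `L`, so pooling across clients can help recover the common component even when each client's series is short, while the sparse `S_k` terms remain purely local.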