We systematically investigate the preservation of differential privacy in functional data analysis, beginning with functional mean estimation and extending to varying coefficient model estimation. Our work introduces a distributed learning framework involving multiple servers, each responsible for collecting several sparsely observed functions. This hierarchical setup gives rise to a mixed notion of privacy. Within each function, user-level differential privacy is applied to $m$ discrete observations. At the server level, central differential privacy is deployed to account for the centralised nature of data collection. Across servers, only private information is exchanged, adhering to federated differential privacy constraints. To address this complex hierarchy, we employ minimax theory to reveal several fundamental phenomena: phase transitions from sparse to dense functional data analysis, the costs of moving from user-level to central and federated differential privacy, and the intricate interplay between different regimes of functional data analysis and privacy preservation. To the best of our knowledge, this is the first study to rigorously examine functional data estimation under multiple privacy constraints. Our theoretical findings are complemented by efficient private algorithms and extensive numerical evidence, providing a comprehensive exploration of this challenging problem.