Differentially Private Manifold Denoising

We introduce a differentially private manifold denoising framework that lets users exploit sensitive reference datasets to correct noisy, non-private query points without compromising the privacy of the reference data. The method follows an iterative procedure that (i) privately estimates local means and tangent geometry from the reference data under calibrated sensitivity, (ii) projects query points along the privately estimated subspace toward the local mean via a corrective step at each iteration, and (iii) performs rigorous privacy accounting across iterations and queries under $(\varepsilon,\delta)$-differential privacy (DP). Conceptually, the framework brings differential privacy to manifold methods: it retains sufficient geometric signal for downstream tasks such as embedding, clustering, and visualization, while providing formal DP guarantees for the reference data. Practically, the procedure is modular and scalable. It separates DP-protected local geometry (means and tangents) from budgeted query-point updates, and a simple scheduler allocates the privacy budget across iterations and queries. Under standard assumptions on manifold regularity, sampling density, and measurement noise, we establish high-probability utility guarantees showing that corrected queries converge toward the manifold at a non-asymptotic rate governed by sample size, noise level, bandwidth, and the privacy budget. Simulations and case studies demonstrate accurate signal recovery under moderate privacy budgets and illustrate clear utility-privacy trade-offs. The result is a deployable DP component for manifold-based workflows in regulated environments that does not require reengineering existing privacy systems.
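As a rough illustration of steps (i) to (iii), the following minimal Python sketch shows one way such a loop could be organized. It is not the paper's algorithm, and every specific choice in it is an assumption made for illustration: the function names (`gaussian_sigma`, `dp_local_geometry`, `dp_denoise`) are hypothetical, privacy is taken with respect to adding or removing one reference record, noise comes from the classical Gaussian mechanism, the budget is split evenly by basic composition (standing in for the scheduler), and the corrective step is read as removing the component of the query's offset from the local mean that lies outside the estimated tangent space.

```python
import numpy as np


def gaussian_sigma(sensitivity, eps, delta):
    """Classical Gaussian-mechanism noise scale (valid for 0 < eps < 1)."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps


def dp_local_geometry(ref, x, h, d, eps, delta, rng):
    """Noisy local mean and rank-d tangent projector around the query x.

    Assumption: neighbouring datasets differ by adding/removing one row of ref.
    The (eps, delta) budget is split evenly over three noisy releases.
    """
    e, dl = eps / 3.0, delta / 3.0
    D = x.shape[0]
    diffs = ref - x
    diffs = diffs[np.linalg.norm(diffs, axis=1) <= h]  # rows have norm <= h

    # Release 1: neighbour count, sensitivity 1.
    n_hat = max(diffs.shape[0] + rng.normal(0.0, gaussian_sigma(1.0, e, dl)), 1.0)

    # Release 2: sum of displacements, L2 sensitivity h.
    s_hat = diffs.sum(axis=0) + rng.normal(0.0, gaussian_sigma(h, e, dl), size=D)
    m_hat = s_hat / n_hat
    # Post-processing: clip the noisy mean offset to norm <= h.
    m_hat = m_hat * min(1.0, h / max(np.linalg.norm(m_hat), 1e-12))

    # Release 3: scatter about the noisy mean. Centred rows have norm <= 2h,
    # so one record changes the scatter by at most 4 h^2 in Frobenius norm.
    z = diffs - m_hat
    noise = rng.normal(0.0, gaussian_sigma(4.0 * h**2, e, dl), size=(D, D))
    noise = np.triu(noise) + np.triu(noise, 1).T  # symmetric noise matrix
    _, vecs = np.linalg.eigh(z.T @ z + noise)      # eigenvalues ascending
    V = vecs[:, -d:]                               # top-d eigenvectors
    return x + m_hat, V @ V.T


def dp_denoise(ref, x0, h, d, eps, delta, n_iter=3, step=1.0, seed=0):
    """Iteratively correct one query; total cost (eps, delta) by basic composition."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        mu, P = dp_local_geometry(ref, x, h, d, eps / n_iter, delta / n_iter, rng)
        r = x - mu
        x = x - step * (r - P @ r)  # remove the off-tangent part of the offset
    return x


if __name__ == "__main__":
    # Toy data (assumed): noisy samples of the unit circle in the plane.
    rng = np.random.default_rng(1)
    theta = rng.uniform(0.0, 2.0 * np.pi, 50000)
    ref = np.c_[np.cos(theta), np.sin(theta)] + rng.normal(0.0, 0.02, (50000, 2))
    x0 = 1.3 * np.array([np.cos(0.3), np.sin(0.3)])
    x = dp_denoise(ref, x0, h=0.5, d=1, eps=3.0, delta=1e-5)
    print("distance to circle before:", abs(np.linalg.norm(x0) - 1.0))
    print("distance to circle after: ", abs(np.linalg.norm(x) - 1.0))
```

The sketch handles a single query. Serving several queries from the same reference set would require dividing the budget further (or a tighter accountant), which is the role the abstract assigns to the scheduler. The bandwidth `h` also acts as the clipping radius here, so it controls both the curvature bias and the noise scale.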
