We study continual mean estimation, where data vectors arrive sequentially and the goal is to maintain an accurate estimate of the running mean at every time step. We address this problem under user-level differential privacy, which protects a user's entire dataset even when the user contributes multiple data points. Previous work on this problem has focused on pure differential privacy; while important, this setting limits applicability, as it leads to overly noisy estimates. In contrast, we analyze the problem under approximate differential privacy, building on recent advances in the Matrix Factorization mechanism. We introduce a novel mean-estimation-specific factorization that is both efficient and accurate, achieving asymptotically lower mean-squared-error bounds for continual mean estimation under user-level differential privacy.
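To make the Matrix Factorization mechanism concrete, the following is a minimal NumPy sketch of the generic idea for privately releasing prefix sums (from which running means follow), not the paper's specific factorization. It writes the lower-triangular prefix-sum workload as A = B·B using the known square-root factorization (Toeplitz coefficients of (1-x)^{-1/2}), adds Gaussian noise calibrated to the column norm of the right factor, and divides by the step index. The horizon `T` and noise scale `sigma` are placeholder values; the (ε, δ) calibration of `sigma` is omitted.

```python
import numpy as np

def sqrt_factor(T):
    """Lower-triangular Toeplitz factor B with B @ B = tril(ones)."""
    # Coefficients of the power series (1 - x)^{-1/2}:
    # c_0 = 1, c_k = c_{k-1} * (2k - 1) / (2k).
    c = np.ones(T)
    for k in range(1, T):
        c[k] = c[k - 1] * (2 * k - 1) / (2 * k)
    B = np.zeros((T, T))
    for i in range(T):
        B[i, : i + 1] = c[: i + 1][::-1]  # B[i, j] = c_{i-j}
    return B

T = 8                                   # placeholder horizon
A = np.tril(np.ones((T, T)))            # prefix-sum workload matrix
B = sqrt_factor(T)
assert np.allclose(B @ B, A)            # valid factorization A = B @ B

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=T)       # clipped per-step contributions (illustrative)
sigma = 0.5                             # placeholder noise scale (DP calibration omitted)

# L2 sensitivity of the right factor: max column norm of B.
sens = np.linalg.norm(B, axis=0).max()

# Mechanism: release B @ (B @ x + noise) = A @ x + correlated noise.
noisy_prefix = B @ (B @ x + sigma * sens * rng.standard_normal(T))
running_mean = noisy_prefix / np.arange(1, T + 1)
```

The point of the factorization is that the injected noise B·z is correlated across time steps, which yields lower error on the prefix-sum workload than adding independent noise to each released value.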