Privately releasing marginals of a tabular dataset is a foundational problem in differential privacy. However, state-of-the-art mechanisms suffer from a computational bottleneck when marginal estimates are reconstructed from noisy measurements. Recently, residual queries were introduced and shown to lead to highly efficient reconstruction in the batch query answering setting. We introduce new techniques to integrate residual queries into state-of-the-art adaptive mechanisms such as AIM. Our contributions include a novel conceptual framework for residual queries using multi-dimensional arrays, lazy updating strategies, and adaptive optimization of the per-round privacy budget allocation. Together these contributions reduce error, improve speed, and simplify residual query operations. We integrate these innovations into a new mechanism (AIM+GReM), which improves AIM by using fast residual-based reconstruction instead of a graphical model approach. Our mechanism is orders of magnitude faster than the original framework and demonstrates competitive error and greatly improved scalability.