Reconstruction attacks and defenses are essential to understanding the data leakage problem in machine learning. However, prior work has centered on empirical observations of gradient inversion attacks, lacks theoretical justification, and cannot disentangle the usefulness of a defense method from the computational limitations of the attack method. In this work, we propose to view the problem as an inverse problem, which enables us to evaluate data reconstruction theoretically, quantitatively, and systematically. For various defense methods, we derive an algorithmic upper bound and a matching (in feature dimension and model width) information-theoretic lower bound on the reconstruction error for two-layer neural networks. To complement the theoretical results and investigate the utility-privacy trade-off, we define a natural evaluation metric for comparing defense methods at similar utility loss under the strongest attacks. We further propose a strong reconstruction attack that, under our proposed evaluation metric, helps update some previous understanding of the strength of defense methods.