Recent methods for auditing the privacy of machine learning algorithms have improved computational efficiency by simultaneously intervening on multiple training examples in a single training run. Steinke et al. (2024) prove that one-run auditing indeed lower bounds the true privacy parameter of the audited algorithm, and give impressive empirical results. Their work leaves open the question of how precisely one-run auditing can uncover the true privacy parameter of an algorithm, and how that precision depends on the audited algorithm. In this work, we characterize the maximum achievable efficacy of one-run auditing and show that the key barrier to its efficacy is interference between the observable effects of different data elements. We present new conceptual approaches to minimize this barrier, towards improving the performance of one-run auditing of real machine learning algorithms.
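To make the one-run auditing setup concrete, here is a toy sketch in the spirit of Steinke et al. (2024): each of many canaries is independently included with probability 1/2 in a single run, an auditor guesses inclusion from observed per-canary scores, and the guessing accuracy is converted into a lower bound on the privacy parameter. The "mechanism", the threshold guesser, and the simple log-odds bound below are illustrative assumptions, not the paper's actual estimator (which is tighter and accounts for making many simultaneous guesses); real audits would train a model and score the canaries' losses.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_run_audit(num_canaries=1000, true_eps=1.0):
    """Toy one-run privacy audit: returns (guess accuracy, epsilon lower bound)."""
    # Independently include each canary with probability 1/2 -- all
    # interventions happen within a single (simulated) training run.
    included = rng.integers(0, 2, size=num_canaries).astype(bool)
    # Stand-in for the audited mechanism: the observed score for each canary
    # is its inclusion bit plus Laplace noise scaled so the per-canary
    # privacy loss is roughly true_eps. No interference between canaries
    # is modeled here, which is the idealized best case.
    scores = included.astype(float) + rng.laplace(
        scale=1.0 / true_eps, size=num_canaries
    )
    # The auditor guesses inclusion by thresholding each score.
    guesses = scores > 0.5
    acc = float(np.mean(guesses == included))
    # Crude accuracy-based lower bound on epsilon via the log-odds of a
    # correct guess (a simplification of the paper's bound).
    eps_lb = np.log(acc / (1.0 - acc)) if acc < 1.0 else float("inf")
    return acc, eps_lb
```

Because the toy scores carry independent noise per canary, the recovered bound sits below but near `true_eps`; interference between canaries' observable effects, the barrier discussed above, would push the accuracy (and hence the bound) further down.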