Machine learning models trained with differentially private (DP) algorithms such as DP-SGD enjoy resilience against a wide range of privacy attacks. Although it is possible to derive bounds for some attacks based solely on an $(\varepsilon,\delta)$-DP guarantee, meaningful bounds require a sufficiently small privacy budget (i.e., injecting a large amount of noise), which results in a large loss in utility. This paper presents a new approach to evaluating the privacy of machine learning models against specific record-level threats, such as membership and attribute inference, without the indirection through DP. We focus on the popular DP-SGD algorithm and derive simple closed-form bounds. Our proofs model DP-SGD as an information-theoretic channel whose inputs are the secrets that an attacker wants to infer (e.g., membership of a data record) and whose outputs are the intermediate model parameters produced by iterative optimization. We obtain bounds for membership inference that match state-of-the-art techniques, whilst being orders of magnitude faster to compute. Additionally, we present a novel data-dependent bound against attribute inference. Our results provide a direct, interpretable, and practical way to evaluate the privacy of trained models against specific inference threats without sacrificing utility.
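The channel view described in the abstract can be illustrated with a minimal sketch of textbook DP-SGD (per-example gradient clipping followed by Gaussian noise). This is not the paper's bound or its implementation. The least-squares loss, the full-batch updates without subsampling, and the names `dp_sgd_step`, `dp_sgd_channel`, `clip_norm`, and `noise_multiplier` are all assumptions made for illustration.

```python
import numpy as np


def dp_sgd_step(theta, X, y, lr, clip_norm, noise_multiplier, rng):
    """One noisy gradient step (least-squares loss is an illustrative choice)."""
    # Per-example gradients of 0.5 * (x @ theta - y)**2, shape (n, d).
    residuals = X @ theta - y
    grads = residuals[:, None] * X
    # Clip each per-example gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Sum, add Gaussian noise with std noise_multiplier * clip_norm, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=theta.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / len(X)
    return theta - lr * noisy_grad


def dp_sgd_channel(member_bit, X, y, target_x, target_y, steps=5, lr=0.1,
                   clip_norm=1.0, noise_multiplier=1.0, seed=0):
    """Channel view: the input is the secret membership bit, the output is
    the sequence of intermediate parameters that an attacker observes."""
    rng = np.random.default_rng(seed)
    if member_bit:
        # The target record is part of the training set only when the bit is set.
        X = np.vstack([X, target_x[None, :]])
        y = np.append(y, target_y)
    theta = np.zeros(X.shape[1])
    trajectory = []
    for _ in range(steps):
        theta = dp_sgd_step(theta, X, y, lr, clip_norm, noise_multiplier, rng)
        trajectory.append(theta.copy())
    return trajectory
```

Calling `dp_sgd_channel(0, ...)` and `dp_sgd_channel(1, ...)` on the same data produces the two output distributions that a membership inference attacker would try to tell apart. Because of clipping, the target record shifts the summed gradient by at most `clip_norm` in each step.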