Membership Inference Attacks (MIAs) aim to identify specific data samples within the private training dataset of a machine learning model, leading to serious privacy violations and other sophisticated threats. Many practical black-box MIAs require query access to the data distribution (the same distribution from which the private data is drawn) to train shadow models. By doing so, the adversary obtains models trained "with" or "without" samples drawn from the distribution and analyzes the characteristics of the samples under consideration. The adversary often needs to train hundreds of shadow models to extract the signals needed for MIAs, and this training constitutes the computational overhead of the attack. In this paper, we argue that by strategically choosing the samples, MI adversaries can maximize their attack success while minimizing the number of shadow models. First, our motivating experiments suggest that memorization is the key property explaining why samples differ in their vulnerability to MIAs. We formalize this through a theoretical bound that connects MI advantage with memorization. Second, we present sample complexity bounds that connect the number of shadow models needed for MIAs with memorization. Lastly, we confirm our theoretical arguments with comprehensive experiments: by utilizing samples with high memorization scores, the adversary can (a) significantly improve its efficacy regardless of the MIA used, and (b) reduce the number of shadow models by nearly two orders of magnitude compared to state-of-the-art approaches.
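To make the link between shadow models and memorization concrete, the following is a minimal sketch, not the paper's implementation. It assumes a common leave-one-out notion of memorization: the probability that a model trained with a sample predicts its label correctly, minus the same probability for models trained without it. The abstract does not state the exact definition, so this choice is an assumption. The 1-nearest-neighbour "shadow model", the toy two-cluster data, and the names `make_toy_data`, `one_nn_predict`, and `memorization_scores` are all hypothetical stand-ins chosen so the example runs with NumPy alone.

```python
import numpy as np


def make_toy_data(seed=1):
    """Two well-separated 2-D clusters; sample 0 gets a flipped (atypical) label."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(20, 2))
    x1 = rng.normal(loc=[3.0, 0.0], scale=0.5, size=(20, 2))
    x = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(20, dtype=int), np.ones(20, dtype=int)])
    y[0] = 1  # atypical sample: lies in cluster 0 but carries label 1
    return x, y


def one_nn_predict(train_x, train_y, query):
    """Toy 'shadow model': label of the nearest training point."""
    dists = np.linalg.norm(train_x - query, axis=1)
    return train_y[np.argmin(dists)]


def memorization_scores(x, y, n_shadow=200, seed=0):
    """Estimate Pr[correct | sample IN] - Pr[correct | sample OUT] per sample.

    Each shadow model is trained on a random half of the data, so every
    sample is IN for some shadow models and OUT for the others.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    correct_in = np.zeros(n)
    count_in = np.zeros(n)
    correct_out = np.zeros(n)
    count_out = np.zeros(n)

    for _ in range(n_shadow):
        member = rng.random(n) < 0.5  # IN/OUT mask for this shadow model
        if not member.any():
            continue  # skip the (very unlikely) empty training set
        train_x, train_y = x[member], y[member]
        for i in range(n):
            hit = one_nn_predict(train_x, train_y, x[i]) == y[i]
            if member[i]:
                correct_in[i] += hit
                count_in[i] += 1
            else:
                correct_out[i] += hit
                count_out[i] += 1

    # Guard against division by zero for samples never seen IN or never OUT.
    p_in = correct_in / np.maximum(count_in, 1)
    p_out = correct_out / np.maximum(count_out, 1)
    return p_in - p_out


if __name__ == "__main__":
    x, y = make_toy_data()
    scores = memorization_scores(x, y)
    ranked = np.argsort(scores)[::-1]
    print("most memorized sample indices:", ranked[:5])
    print("their scores:", np.round(scores[ranked[:5]], 2))
```

In this toy setting the mislabeled sample is predicted correctly only when it is in the training set, so its estimated score is the highest. An adversary following the strategy described above would prioritize such high-score samples. How the scores translate into MI advantage or into a required number of shadow models is given by the paper's bounds, which this sketch does not reproduce.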