A membership inference attack (MIA) determines whether a record is in a machine learning model's training set by querying the model. MIAs on classic classification models have been well studied, and recent work has begun to explore how to transfer them to generative models. Our investigation indicates that existing MIAs designed for generative models depend mainly on overfitting in the target model. However, overfitting can be avoided with various regularization techniques, so existing MIAs perform poorly in practice. Unlike overfitting, memorization is essential for deep learning models to attain optimal performance, which makes it a more prevalent phenomenon. Memorization in generative models produces an increasing trend in the probability of generating records in the neighborhood of a member record. Therefore, we propose the Probabilistic Fluctuation Assessing Membership Inference Attack (PFAMI), a black-box MIA that infers membership by detecting these trends through an analysis of the overall probabilistic fluctuations around given records. We conduct extensive experiments across multiple generative models and datasets, which demonstrate that PFAMI can improve the attack success rate (ASR) by about 27.9% compared with the best baseline.
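To make the idea of probabilistic fluctuation concrete, the following is a minimal toy sketch, not the PFAMI implementation. The abstract does not specify the probability estimator, the perturbation scheme, or the decision rule, so everything here is an assumption for illustration: `toy_log_prob` is a hypothetical 1-D stand-in for a model's probability estimate, neighbors are drawn with Gaussian noise, and the threshold of zero is arbitrary. The sketch reads "increasing trend around the member record" as the record being more probable than its neighbors on average.

```python
import math
import random


def toy_log_prob(x, members, bandwidth=0.5):
    """Hypothetical stand-in for a generative model's probability estimate.

    Assumption: a 1-D kernel density that peaks at each training member,
    mimicking memorization. A real attack would query the target model.
    """
    total = sum(math.exp(-((x - m) ** 2) / (2 * bandwidth ** 2)) for m in members)
    return math.log(total / len(members) + 1e-12)


def fluctuation_score(x, log_prob, n_neighbors=20, scale=0.3, seed=0):
    """Average drop in log-probability from x to randomly perturbed neighbors.

    A positive score means x is more probable than its neighborhood on
    average, i.e. probability rises toward x.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch deterministic
    base = log_prob(x)
    drops = [base - log_prob(x + rng.gauss(0.0, scale)) for _ in range(n_neighbors)]
    return sum(drops) / len(drops)


def infer_membership(x, log_prob, threshold=0.0):
    """Flag x as a member when its fluctuation score exceeds the threshold.

    The threshold is an illustrative choice, not a value from the paper.
    """
    return fluctuation_score(x, log_prob) > threshold


if __name__ == "__main__":
    members = [0.0, 5.0, 10.0]  # toy "training set"

    def lp(x):
        return toy_log_prob(x, members)

    for candidate in (5.0, 2.5):
        print(candidate, fluctuation_score(candidate, lp), infer_membership(candidate, lp))
```

In this toy setting, a training point such as 5.0 gets a positive score because every perturbed neighbor is less probable, while a point between members such as 2.5 gets a negative score because its neighbors drift toward the peaks. The signal comes from the local shape of the distribution rather than from the absolute probability of the record, which is why this kind of test does not rely on the model being overfit.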