Diffusion models have begun to overshadow GANs and other generative models in industrial applications because of their superior image generation performance. Their complex architecture exposes a wide range of potential attack features. Motivated by this, we design membership inference attacks (MIAs) tailored to diffusion models. We first conduct a thorough analysis of existing MIAs on diffusion models, considering factors such as black-box versus white-box access and the choice of attack features. We find that white-box attacks are highly applicable in real-world scenarios and that the most effective attacks at present are white-box. Whereas earlier work uses the model loss as the attack feature for white-box MIAs, we use model gradients, which capture in finer detail how the model responds to individual samples. We run extensive experiments across a range of parameters, including training steps, sampling frequency, diffusion steps, and data variance. Across all experimental settings, our method consistently achieves near-perfect attack performance, with an attack success rate approaching $100\%$ and an attack AUC-ROC near $1.0$. We also evaluate our attack against common defense mechanisms and observe that it continues to perform well.
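To make the idea of a gradient-based attack feature concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes a toy linear noise predictor `eps_hat = W @ x_t` with an analytically computed gradient, and it summarizes each sample by the L2 norm of the loss gradient accumulated over a few diffusion steps. The names `noisy_sample`, `grad_norm_feature`, and `infer_membership` are hypothetical, and the threshold rule (smaller gradient norm suggests a training member) is an illustrative assumption; the abstract does not state how gradients are aggregated or which attack classifier is used.

```python
import math
import random


def noisy_sample(x0, eps, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def grad_norm_feature(W, x0, alpha_bars, rng):
    """Return the L2 norm of d(loss)/dW summed over the given diffusion steps.

    Toy setup (an assumption for illustration): the noise predictor is linear,
    eps_hat = W @ x_t, and the loss is the squared error ||eps_hat - eps||^2,
    so the gradient with respect to W is 2 * (eps_hat - eps) * x_t^T.
    """
    d = len(x0)
    grad = [[0.0] * d for _ in range(d)]
    for alpha_bar in alpha_bars:
        # Draw the noise used for this diffusion step.
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_t = noisy_sample(x0, eps, alpha_bar)
        # Predict the noise and compute the residual.
        pred = [sum(W[i][j] * x_t[j] for j in range(d)) for i in range(d)]
        resid = [pred[i] - eps[i] for i in range(d)]
        # Accumulate the analytic gradient of the squared error.
        for i in range(d):
            for j in range(d):
                grad[i][j] += 2.0 * resid[i] * x_t[j]
    return math.sqrt(sum(g * g for row in grad for g in row))


def infer_membership(feature, threshold):
    """Illustrative decision rule: a small gradient norm is read as 'member'."""
    return feature < threshold


if __name__ == "__main__":
    rng = random.Random(0)
    W = [[0.0, 0.0], [0.0, 0.0]]  # untrained toy predictor
    feature = grad_norm_feature(W, [0.3, -0.7], [0.9, 0.5, 0.1], rng)
    print(feature, infer_membership(feature, threshold=1.0))
```

In a realistic white-box setting the gradient would come from automatic differentiation through the actual denoising network, and the per-sample features would more plausibly be fed to a trained attack classifier than compared against a fixed threshold.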