Model Inversion (MI) attacks based on Generative Adversarial Networks (GANs) aim to recover private training data from complex deep learning models by searching for codes in the latent space. However, these attacks search only a deterministic latent space, so the latent code they find is usually suboptimal. In addition, existing distributional MI schemes assume that the attacker can access the structure and parameters of the target model, which is not always feasible in practice. To overcome these shortcomings, this paper proposes a novel Distributional Black-Box Model Inversion (DBB-MI) attack that constructs a probabilistic latent space in which to search for the target private data. Specifically, DBB-MI requires neither the target model's parameters nor specialized GAN training. Instead, it finds the latent probability distribution by combining the output of the target model with multi-agent reinforcement learning techniques. It then randomly samples latent codes from this distribution to recover the private data. Because the latent probability distribution closely aligns with the target private data in latent space, the recovered data significantly leak private information about the target model's training samples. Extensive experiments on diverse datasets and networks show that DBB-MI outperforms state-of-the-art methods in attack accuracy, K-nearest neighbor feature distance, and Peak Signal-to-Noise Ratio.
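The idea of searching a latent probability distribution with only query access can be illustrated with a minimal sketch. It is not the DBB-MI algorithm: the multi-agent reinforcement learning component is replaced by a simple cross-entropy-method update, and `generator`, `query_target`, the dimensions, and the random weights are all hypothetical toy stand-ins introduced for illustration. The sketch keeps a diagonal Gaussian over latent codes, scores sampled codes using only the target model's output probabilities, and refits the Gaussian to the best-scoring codes.

```python
import numpy as np

# All sizes and weights below are toy assumptions, not values from the paper.
LATENT_DIM, DATA_DIM, NUM_CLASSES = 8, 16, 4

_setup_rng = np.random.default_rng(0)
G_WEIGHTS = _setup_rng.normal(size=(LATENT_DIM, DATA_DIM))     # toy generator weights
CLF_WEIGHTS = _setup_rng.normal(size=(DATA_DIM, NUM_CLASSES))  # hidden from the attacker


def generator(z):
    """Hypothetical stand-in for a pretrained GAN generator (latent codes -> samples)."""
    return np.tanh(z @ G_WEIGHTS)


def query_target(x):
    """Hypothetical black-box target model: exposes class probabilities only."""
    logits = x @ CLF_WEIGHTS
    logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def search_latent_distribution(target_class, iters=30, pop=64, elite_frac=0.25, seed=1):
    """Fit a diagonal Gaussian over latent codes with the cross-entropy method.

    This swaps in a simple evolutionary update for the paper's multi-agent RL;
    only the outputs of `query_target` are used, never its weights.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(LATENT_DIM)
    sigma = np.ones(LATENT_DIM)
    n_elite = max(2, int(pop * elite_frac))
    for _ in range(iters):
        z = mu + sigma * rng.normal(size=(pop, LATENT_DIM))   # sample latent codes
        scores = query_target(generator(z))[:, target_class]  # black-box feedback
        elite = z[np.argsort(scores)[-n_elite:]]              # keep best-scoring codes
        mu = elite.mean(axis=0)
        sigma = elite.std(axis=0) + 1e-3                      # floor keeps sigma positive
    return mu, sigma


def mean_confidence(mu, sigma, target_class, n=256, seed=2):
    """Average target-class confidence of samples drawn from N(mu, sigma^2)."""
    rng = np.random.default_rng(seed)
    z = mu + sigma * rng.normal(size=(n, LATENT_DIM))
    return float(query_target(generator(z))[:, target_class].mean())


if __name__ == "__main__":
    target = 2
    mu, sigma = search_latent_distribution(target)
    prior = mean_confidence(np.zeros(LATENT_DIM), np.ones(LATENT_DIM), target)
    learned = mean_confidence(mu, sigma, target)
    print(f"mean target-class confidence: prior={prior:.3f}, learned={learned:.3f}")
```

Sampling from the fitted Gaussian, rather than keeping a single best code, mirrors the distributional step described above: any code drawn from the learned distribution is passed through the generator to produce a candidate reconstruction.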