Machine learning raises serious privacy concerns, since learned models have been shown to reveal sensitive information about their training data. Many works have investigated how widely adopted data augmentation (DA) and adversarial training (AT) techniques, collectively termed data enhancement in this paper, affect the privacy leakage of machine learning models. Such privacy effects are often measured by membership inference attacks (MIAs), which aim to determine whether a particular example belongs to the training set. We propose to investigate privacy from a new perspective called memorization. Through the lens of memorization, we find that previously deployed MIAs produce misleading results: they are less likely to identify samples with higher privacy risks as members than samples with low privacy risks. To address this problem, we deploy a recent attack that can capture the degree of memorization of individual samples for evaluation. Through extensive experiments, we unveil non-trivial findings about the connections between three essential properties of machine learning models: privacy, generalization gap, and adversarial robustness. We demonstrate that, contrary to existing results, the generalization gap is not highly correlated with privacy leakage. Moreover, stronger adversarial robustness does not necessarily imply that the model is more susceptible to privacy attacks.
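To make the notion of a membership inference attack concrete, the following is a minimal sketch of a generic loss-threshold baseline, which guesses that an example is a training member when the model's loss on it is low. It is not the memorization-aware attack evaluated in this paper; the function names, the example probabilities, and the threshold value are all hypothetical and chosen only for illustration.

```python
import math


def cross_entropy(probs, label):
    """Per-example loss: negative log of the probability given to the true label."""
    return -math.log(max(probs[label], 1e-12))


def loss_threshold_mia(probs, label, threshold):
    """Guess 'member' (True) when the example's loss falls below the threshold."""
    return cross_entropy(probs, label) < threshold


# Hypothetical softmax outputs of a classifier (made-up numbers, not real results).
confident = [0.97, 0.02, 0.01]  # low loss on true label 0
uncertain = [0.40, 0.35, 0.25]  # higher loss on true label 0
threshold = 0.5                 # hypothetical; in practice tuned by the attacker

print(loss_threshold_mia(confident, 0, threshold))  # True  (guessed member)
print(loss_threshold_mia(uncertain, 0, threshold))  # False (guessed non-member)
```

A single global threshold of this kind treats every sample alike, so it cannot express how strongly an individual sample is memorized, which is the kind of limitation the memorization perspective above is meant to expose.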