Membership inference attacks (MIAs) can reveal whether a particular data point was part of the training dataset, potentially exposing sensitive information about individuals. This article provides theoretical guarantees by exploring the fundamental statistical limitations of MIAs on machine learning models. More precisely, we first derive the statistical quantity that governs the effectiveness and success of such attacks. We then prove theoretically that, in a non-linear regression setting with overfitting algorithms, attacks may succeed with high probability. Finally, we investigate several situations in which we can bound this quantity of interest; specifically, it is shown to be bounded by a constant that quantifies the diversity of the underlying data distribution. Interestingly, our findings indicate that discretizing the data might enhance the algorithm's security. We illustrate these results through two simple simulations.
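To make the abstract's claim concrete, the toy simulation below sketches a standard loss-threshold membership inference attack against a deliberately overfitting (interpolating) regressor. This is an illustrative sketch, not the paper's actual experimental setup: the data distribution (y = x² plus Gaussian noise), the nearest-neighbor interpolator, and the threshold value are all assumptions chosen for clarity. Because the model memorizes its training set, members attain zero loss while fresh samples do not, so the attacker separates the two groups almost perfectly.

```python
import random

random.seed(0)

def sample(n):
    """Draw n points from y = x^2 + Gaussian noise (illustrative choice)."""
    return [(x, x * x + random.gauss(0.0, 0.1))
            for x in (random.uniform(-1.0, 1.0) for _ in range(n))]

train = sample(50)  # members of the training set
test = sample(50)   # non-members drawn from the same distribution

# An interpolating ("overfitting") regressor: memorize the training pairs
# exactly, and fall back to the nearest training input elsewhere.
memory = dict(train)

def predict(x):
    if x in memory:
        return memory[x]
    nearest = min(memory, key=lambda t: abs(t - x))
    return memory[nearest]

def loss(point):
    x, y = point
    return (predict(x) - y) ** 2

# Loss-threshold attack: declare "member" when the squared loss is tiny.
# Members are fit exactly here, so their loss is 0; non-members retain
# their noise, so their loss is bounded away from 0 almost surely.
threshold = 1e-12
guesses = ([loss(p) < threshold for p in train]
           + [loss(p) < threshold for p in test])
labels = [True] * len(train) + [False] * len(test)
accuracy = sum(g == l for g, l in zip(guesses, labels)) / len(labels)
print(f"attack accuracy: {accuracy:.2f}")
```

With continuous inputs and noise, the attack succeeds with probability one on this interpolator, illustrating the abstract's point that overfitting regressors can be highly vulnerable; regularized or discretized variants would blur the loss gap between members and non-members.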