Membership inference attacks (MIAs) can reveal whether a particular data point was part of the training dataset, potentially exposing sensitive information about individuals. This article explores the fundamental statistical limitations of MIAs on machine learning models. More precisely, we first derive the statistical quantity that governs the success of such attacks. We then investigate several settings in which we provide bounds on this quantity. These bounds allow us to infer the accuracy of potential attacks as a function of the number of samples and other structural parameters of the learning model, which in some cases can be estimated directly from the dataset.