In this paper, we explore the properties of loss curvature with respect to input data in deep neural networks. We define input loss curvature as the trace of the Hessian of the loss with respect to the input. We investigate how input loss curvature differs between train and test sets, and its implications for train-test distinguishability. We develop a theoretical framework that derives an upper bound on train-test distinguishability as a function of privacy and the size of the training set. This insight motivates a new black-box membership inference attack (MIA) based on input loss curvature. We validate our theoretical findings through experiments on computer vision classification tasks, demonstrating that input loss curvature surpasses existing methods in membership inference effectiveness. Our analysis shows how the performance of MIA methods varies with the size of the training set: curvature-based MIA outperforms other methods once the training set is sufficiently large, a condition that real datasets often satisfy, as demonstrated by our results on CIFAR10, CIFAR100, and ImageNet. These findings not only advance our understanding of deep neural network behavior but also improve the ability to test privacy-preserving techniques in machine learning.
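To make the central quantity concrete, the following is a minimal sketch of how input loss curvature, the trace of the Hessian of the loss with respect to the input, can be estimated in PyTorch using Hutchinson's trace estimator with Rademacher probe vectors. This is an illustration under stated assumptions (a classifier trained with cross-entropy loss), not the paper's exact estimator; the function name `input_loss_curvature` and the number of probe samples are chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def input_loss_curvature(model, x, y, n_samples=10):
    """Estimate tr(H_x L), the trace of the Hessian of the loss L
    with respect to the input x, via Hutchinson's estimator:
    tr(H) = E_v[v^T H v] for Rademacher vectors v in {-1, +1}.
    Note: the estimate sums over the batch, so pass a single example
    to obtain a per-example curvature score."""
    x = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # First-order gradient w.r.t. the input, kept in the graph
    # so we can take Hessian-vector products through it.
    grad = torch.autograd.grad(loss, x, create_graph=True)[0]
    trace_est = 0.0
    for _ in range(n_samples):
        # Rademacher probe vector with entries in {-1, +1}.
        v = torch.randint_like(x, high=2) * 2.0 - 1.0
        # Hessian-vector product: d(grad . v)/dx = H v.
        hv = torch.autograd.grad(grad, x, grad_outputs=v,
                                 retain_graph=True)[0]
        trace_est += (v * hv).sum().item()
    return trace_est / n_samples

# Usage on a toy model and a random CIFAR-shaped input.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.randn(1, 3, 32, 32)
y = torch.tensor([3])
print(input_loss_curvature(model, x, y))
```

In a curvature-based MIA of the kind described above, such a per-example score would be compared across candidate inputs, with the train-test curvature gap providing the membership signal; the thresholding and calibration details are left to the body of the paper.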