Deep Neural Networks (DNNs) have become a pervasive tool for solving many emerging problems. However, they tend to overfit to, and memorize, the training set. Memorization is of keen interest because it is closely related to several concepts such as generalization, noisy learning, and privacy. To study memorization, Feldman (2019) proposed a formal score; however, its computational requirements limit its practical use. Recent research has provided empirical evidence linking input loss curvature (measured by the trace of the loss Hessian with respect to the inputs) to memorization, and computing this curvature proxy was shown to be roughly three orders of magnitude more efficient than computing the memorization score. However, a theoretical understanding connecting memorization with input loss curvature has been lacking. In this paper, we not only investigate this connection but also extend our analysis to establish theoretical links between differential privacy, memorization, and input loss curvature. First, we derive an upper bound on memorization characterized by both differential privacy and input loss curvature. Second, we present a novel insight showing that input loss curvature is upper-bounded by the differential privacy parameter. Our theoretical findings are further empirically validated using deep models on the CIFAR and ImageNet datasets, showing a strong correlation between our theoretical predictions and results observed in practice.
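To make the curvature quantity concrete: the input loss curvature referred to above is the trace of the Hessian of the loss with respect to the input, which in practice is typically estimated stochastically (e.g., via Hutchinson's estimator) rather than by forming the full Hessian. The sketch below is illustrative only and is not the paper's implementation: it uses a toy quadratic "loss" with a known analytic input Hessian, and approximates Hessian-vector products by finite differences of the gradient; the function names and the choices of `n_samples` and `eps` are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "loss" L(x) = 0.5 * x^T A x, whose Hessian w.r.t. the input x is A.
# For a real network, grad_fn would be the gradient of the loss w.r.t. the input.
d = 10
M = rng.standard_normal((d, d))
A = M @ M.T                      # symmetric PSD matrix; the true input Hessian
grad_fn = lambda x: A @ x        # analytic input gradient of the toy loss

def input_curvature(x, grad, n_samples=200, eps=1e-4):
    """Hutchinson estimate of tr(H), the input loss curvature at x.

    Uses E[v^T H v] = tr(H) for Rademacher v, with each Hessian-vector
    product H v approximated by a finite difference of the gradient.
    """
    est = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=x.shape)          # Rademacher probe
        hv = (grad(x + eps * v) - grad(x)) / eps           # H v (approx.)
        est += v @ hv
    return est / n_samples

x = rng.standard_normal(d)
print(input_curvature(x, grad_fn), np.trace(A))  # estimate vs. exact trace
```

Because the toy loss is quadratic, the finite-difference Hessian-vector product is exact here; for a deep model one would instead use automatic differentiation (e.g., double backpropagation) to form the same products, which is what makes the curvature proxy far cheaper than the leave-one-out style computation behind the memorization score.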