Machine learning models that are overfitted or overtrained are more vulnerable to knowledge leakage, which poses a risk to privacy. Suppose we download or receive a model from a third-party collaborator without knowing its training accuracy. How can we determine whether it has been overfitted or overtrained on its training data? The model may even have been intentionally overtrained to make it vulnerable during testing. Because an overfitted or overtrained model may still perform well on testing data and even on some generalization tests, good test performance alone cannot rule out overfitting, and conducting a comprehensive generalization test is expensive. The goal of this paper is to address these issues and to ensure privacy and generalization using only testing data. To achieve this, we analyze the null space of the last layer of neural networks, which enables us to quantify overfitting without access to the training data or knowledge of the model's accuracy on those data. We evaluated our approach on various architectures and datasets and observed a distinct pattern in the null space angle when models are overfitted. Furthermore, we show that models with poor generalization exhibit specific characteristics in this space. Our work represents the first attempt to quantify overfitting without access to the training data or any knowledge about the training samples.
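To make the null space analysis mentioned in the abstract more concrete, the following is a minimal sketch of one way a last-layer null space and an angle to it could be computed. The abstract does not define the angle precisely, so everything here is an assumption made for illustration: the weight matrix `W`, the feature vectors, the helper names `null_space_basis` and `angle_to_null_space`, and the choice to measure the angle between a penultimate-layer feature vector and its projection onto the null space. It should not be read as the authors' exact procedure.

```python
import numpy as np


def null_space_basis(W, rtol=1e-10):
    """Return an orthonormal basis (as columns) for the null space of W, via SVD."""
    _, s, vt = np.linalg.svd(W, full_matrices=True)
    tol = rtol * (s[0] if s.size else 1.0)
    rank = int(np.sum(s > tol))
    # Rows of vt beyond the numerical rank span the null space of W.
    return vt[rank:].T


def angle_to_null_space(x, N):
    """Angle in radians between vector x and the subspace spanned by N's columns."""
    norm = np.linalg.norm(x)
    if norm == 0 or N.shape[1] == 0:
        return float(np.pi / 2)
    proj = N @ (N.T @ x)  # orthogonal projection of x onto the null space
    cos = np.clip(np.linalg.norm(proj) / norm, 0.0, 1.0)
    return float(np.arccos(cos))


# Hypothetical last layer: 10 classes, 64-dimensional features (random stand-in).
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 64))
N = null_space_basis(W)

# Random stand-ins for penultimate-layer features of test inputs.
features = rng.standard_normal((5, 64))
angles = [angle_to_null_space(f, N) for f in features]
```

In this sketch, only test-time features and the last-layer weights are needed, which mirrors the abstract's setting of having no access to the training data. How such angles relate to overfitting is the paper's empirical claim and is not demonstrated by this toy example.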