Although much research has proposed new models or loss functions to improve the generalisation of artificial neural networks (ANNs), less attention has been paid to the impact of the training data on generalisation. In this work, we start by approximating the interaction between samples, i.e. how learning one sample modifies the model's prediction on other samples. By analysing the terms involved in weight updates in supervised learning, we find that labels influence the interaction between samples. We therefore propose the labelled pseudo Neural Tangent Kernel (lpNTK), which takes label information into account when measuring the interactions between samples. First, we prove that lpNTK asymptotically converges to the empirical neural tangent kernel in terms of the Frobenius norm under certain assumptions. Second, we illustrate how lpNTK helps to explain learning phenomena identified in previous work, specifically the learning difficulty of samples and forgetting events during learning. Finally, we show that using lpNTK to identify and remove poisoning training samples does not hurt the generalisation performance of ANNs.
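The notion of interaction between samples described above can be illustrated with a toy sketch. This is not the lpNTK itself, whose definition is not given in this abstract. It is only the standard first-order picture, assuming a linear softmax classifier trained with cross-entropy and plain SGD. For that assumed model, the empirical NTK between two inputs reduces to their inner product times the identity, so one update on (x_u, y_u) changes the logits on x_o by -lr * (x_o . x_u) * (p(x_u) - onehot(y_u)). All function names, weights, and inputs below are hypothetical and chosen for illustration.

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, x):
    # Assumed toy model: z(x) = W x, with W stored as K rows of length d.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sgd_step(W, x_u, y_u, lr):
    # One SGD step on the cross-entropy loss of sample (x_u, y_u);
    # y_u is a class index. The logit gradient is p(x_u) - onehot(y_u).
    p = softmax(logits(W, x_u))
    err = [p_k - (1.0 if k == y_u else 0.0) for k, p_k in enumerate(p)]
    return [[w - lr * err[k] * xi for w, xi in zip(row, x_u)]
            for k, row in enumerate(W)]

def predicted_change(W, x_o, x_u, y_u, lr):
    # First-order estimate of the logit change on x_o after learning (x_u, y_u).
    # For this linear model the kernel term is the scalar x_o . x_u.
    p = softmax(logits(W, x_u))
    k_ou = sum(a * b for a, b in zip(x_o, x_u))
    return [-lr * k_ou * (p_k - (1.0 if k == y_u else 0.0))
            for k, p_k in enumerate(p)]

# Hypothetical 3-class, 2-feature setup.
W = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]
x_u, x_o = [1.0, 2.0], [0.5, 1.5]
lr = 0.1

# Same pair of inputs, two different labels for the updated sample.
for y_u in (0, 2):
    W_new = sgd_step(W, x_u, y_u, lr)
    actual = [a - b for a, b in zip(logits(W_new, x_o), logits(W, x_o))]
    estimate = predicted_change(W, x_o, x_u, y_u, lr)
    print(y_u, actual, estimate)
```

In this sketch the kernel term x_o . x_u is the same for both runs, yet the logit for class 0 on x_o rises when x_u is labelled 0 and falls when it is labelled 2. This is one simple way to see why a measure of sample interaction might need to account for labels as well as inputs.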