Standard deep learning architectures used for classification generate label predictions with a projection head and softmax activation function. Although successful, these methods fail to leverage the relational information between samples in the batch when generating label predictions. In recent works, graph-based learning techniques, namely Laplace learning, have been heuristically combined with neural networks for both supervised and semi-supervised learning (SSL) tasks. However, prior works either approximate the gradient of the loss through the graph learning step or decouple the two processes entirely; end-to-end integration with neural networks is not achieved. In this work, we derive backpropagation equations, via the adjoint method, for the inclusion of a general family of graph learning layers into a neural network. This allows us to precisely integrate graph Laplacian-based label propagation into a neural network layer, replacing a projection head and softmax activation function for classification tasks. Using this new framework, our experimental results demonstrate smooth label transitions across data, improved robustness to adversarial attacks, improved generalization, and improved training dynamics compared to the standard softmax-based approach.
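To make the forward pass concrete, the following is a minimal sketch of the kind of graph Laplacian-based label propagation (Laplace learning) layer the abstract describes: given batch embeddings and a few labeled points, it builds a similarity graph and solves the harmonic (Laplace) equation for the unlabeled points. The Gaussian edge weights, the dense solve, and all names here are illustrative assumptions, not the paper's implementation, which additionally derives the backward pass via the adjoint method.

```python
import numpy as np

def laplace_learning(features, labels, labeled_mask, sigma=1.0):
    """Propagate labels over a batch via the graph Laplacian.

    features:     (n, d) array of embeddings for the batch
    labels:       (n, k) one-hot rows; rows for unlabeled points are ignored
    labeled_mask: (n,) boolean mask marking the labeled points
    Returns an (n, k) array of propagated label scores.
    """
    # Gaussian similarity graph over the batch (illustrative choice of kernel)
    sq_dists = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W
    L = np.diag(W.sum(axis=1)) - W

    u = labels.astype(float).copy()
    unlabeled = ~labeled_mask
    # Harmonic solution on the unlabeled nodes: L_uu u_u = -L_ul u_l,
    # so predictions are smooth on the graph and agree with the given labels.
    u[unlabeled] = np.linalg.solve(
        L[np.ix_(unlabeled, unlabeled)],
        -L[np.ix_(unlabeled, labeled_mask)] @ labels[labeled_mask],
    )
    return u
```

In a network of the kind described above, `features` would be the output of the backbone, and this solve would replace the projection head and softmax; the paper's contribution is the exact gradient of the loss through this solve, which this sketch does not implement.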