The generalization performance of deep neural networks in classification tasks is a central concern in machine learning research. Despite widely used techniques for mitigating over-fitting, such as data augmentation, pseudo-labeling, regularization, and ensemble learning, this performance still leaves room for improvement through other approaches. In recent years, it has been shown theoretically that characteristics of the loss function, namely its Lipschitz constant and maximum value, affect the generalization performance of deep neural networks; this result can guide the design of novel distance measures. In this paper, by analyzing these characteristics, we introduce a distance called Reduced Jeffries-Matusita and use it as a loss function for training deep classification models to reduce over-fitting. In our experiments, we evaluate the new loss function on two problems: image classification in computer vision and node classification in graph learning. The results show that the new distance measure substantially stabilizes the training process, enhances generalization, and improves model performance in terms of Accuracy and F1-score, even when the training set is small.
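For context, the classical Jeffries-Matusita distance between discrete distributions p and q is JM(p, q) = sqrt(2 * (1 - sum_i sqrt(p_i * q_i))), which is bounded above by sqrt(2); this boundedness relates directly to the maximum-value characteristic mentioned above. The sketch below is a minimal, illustrative PyTorch implementation of this classical distance used as a loss against one-hot labels, not the Reduced variant proposed in the paper, whose exact definition appears in the body; the function name jm_loss and the clamping constant are assumptions made for illustration only.

    import torch
    import torch.nn.functional as F

    def jm_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Classical Jeffries-Matusita distance between the predicted class
        # distribution and a one-hot target. With a one-hot target q, the
        # Bhattacharyya coefficient sum_i sqrt(p_i * q_i) reduces to sqrt(p_y),
        # so JM = sqrt(2 - 2 * sqrt(p_y)), bounded above by sqrt(2).
        probs = F.softmax(logits, dim=-1)                        # predicted distribution p
        p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)   # probability of the true class
        jm = torch.sqrt(torch.clamp(2.0 - 2.0 * torch.sqrt(p_y), min=1e-12))  # clamp (assumed) for numerical stability
        return jm.mean()

A bounded loss of this form keeps each sample's contribution below sqrt(2), in contrast to cross-entropy, which grows without bound as the true-class probability approaches zero.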