Numerous studies have shown that label noise can lead to poor generalization performance, negatively affecting classification accuracy. Understanding the effectiveness of classifiers trained with deep neural networks in the presence of noisy labels is therefore of considerable practical significance. In this paper, we focus on error bounds for the excess risk of classification problems with noisy labels within deep learning frameworks. We derive error bounds for the excess risk by decomposing it into a statistical error and an approximation error. To handle statistical dependence (e.g., mixing sequences), we employ an independent block construction to bound the statistical error, leveraging techniques for dependent processes. For the approximation error, we extend existing theoretical results to the vector-valued setting, where the output space consists of $K$-dimensional unit vectors. Finally, under the low-dimensional manifold hypothesis, we further refine the approximation error bound to mitigate the impact of high-dimensional input spaces.
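The decomposition mentioned above can be sketched as follows; this is a standard schematic (the notation $\hat f_n$, $\mathcal{F}$, $f^*$ is illustrative and not taken from the abstract), with the estimator $\hat f_n$ trained over a network class $\mathcal{F}$ and $f^*$ the Bayes-optimal classifier:

```latex
% Excess risk of the empirical estimator \hat{f}_n over the network class \mathcal{F}:
%   R(f) denotes the (noisy-label) risk, f^* the Bayes-optimal classifier.
\mathcal{E}(\hat{f}_n)
  \;=\; R(\hat{f}_n) - R(f^*)
  \;\le\; \underbrace{\Bigl( R(\hat{f}_n) - \inf_{f \in \mathcal{F}} R(f) \Bigr)}_{\text{statistical error}}
  \;+\; \underbrace{\Bigl( \inf_{f \in \mathcal{F}} R(f) - R(f^*) \Bigr)}_{\text{approximation error}}
```

Under this split, the statistical term is controlled via the independent block construction for mixing sequences, while the approximation term is bounded by the expressive power of the network class on $K$-dimensional unit-vector outputs.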