Convolutional neural networks (CNNs) trained with cross-entropy loss have proven to be extremely successful in classifying images. In recent years, much work has been done to improve the theoretical understanding of neural networks. Nevertheless, this understanding remains limited when the networks are trained with cross-entropy loss, mainly because of the unboundedness of the target function. In this paper, we aim to fill this gap by analyzing the convergence rate of the excess risk of a CNN classifier trained with cross-entropy loss. Under suitable assumptions on the smoothness and structure of the a posteriori probability, we show that these classifiers achieve a rate of convergence that is independent of the dimension of the image. These rates are in line with practical observations about CNNs.