This paper addresses the estimation of the Lipschitz constant of general neural network architectures using semidefinite programming. For this purpose, we interpret neural networks as time-varying dynamical systems, where the $k$-th layer corresponds to the dynamics at time $k$. A key novelty with respect to prior work is that we use this interpretation to exploit the series interconnection structure of feedforward neural networks through a dynamic programming recursion. Nonlinearities, such as activation functions and nonlinear pooling layers, are handled with integral quadratic constraints. If the neural network contains signal processing layers (convolutional or state space model layers), we realize them as 1-D/2-D/N-D systems and exploit this structure as well. Our approach differs from related work on Lipschitz constant estimation in two respects: it exploits structure more extensively, which improves scalability, and it generalizes to a large class of common neural network architectures. To show the versatility and computational advantages of our method, we apply it to different neural network architectures trained on MNIST and CIFAR-10.
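To make the layer-as-time-step view concrete, the following minimal sketch runs a small ReLU network as the recursion $x_{k+1} = \mathrm{relu}(W_k x_k + b_k)$ and accumulates a Lipschitz bound layer by layer. It is not the semidefinite programming method of this paper: it computes only the naive product-of-norms bound, the kind of conservative baseline that SDP-based estimates aim to tighten. The function names, the toy weights, and the use of $\|W\|_2 \le \sqrt{\|W\|_1 \|W\|_\infty}$ as a per-layer norm bound are assumptions made for illustration.

```python
import math

def norm2_upper_bound(W):
    """Upper bound on the spectral norm: ||W||_2 <= sqrt(||W||_1 * ||W||_inf)."""
    max_col_sum = max(sum(abs(row[j]) for row in W) for j in range(len(W[0])))
    max_row_sum = max(sum(abs(w) for w in row) for row in W)
    return math.sqrt(max_col_sum * max_row_sum)

def relu(x):
    return [max(0.0, v) for v in x]

def forward(weights, biases, x):
    """Layer k acts as the dynamics at time k; the last layer is affine (no ReLU)."""
    last = len(weights) - 1
    for k, (W, b) in enumerate(zip(weights, biases)):
        x = [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]
        if k < last:
            x = relu(x)
    return x

def naive_lipschitz_bound(weights):
    """Product of per-layer norm bounds; valid because ReLU is 1-Lipschitz."""
    bound = 1.0
    for W in weights:
        bound *= norm2_upper_bound(W)
    return bound

# Hypothetical toy network (2 -> 2 -> 1); the values are illustrative only.
weights = [[[1.0, -2.0], [0.5, 1.5]], [[2.0, -1.0]]]
biases = [[0.1, -0.2], [0.0]]
L = naive_lipschitz_bound(weights)
```

For any two inputs `x` and `y`, the output gap `forward(weights, biases, x)` versus `forward(weights, biases, y)` is at most `L` times the Euclidean distance between `x` and `y`. The bound ignores how consecutive layers interact, which is why it is typically loose.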