Robustness against small input changes is a highly desirable property for deep networks. One popular way to achieve it is to design networks with a small Lipschitz constant. In this work, we propose a new technique for constructing such Lipschitz networks that has a number of desirable properties: it can be applied to any linear network layer (fully-connected or convolutional), it provides formal guarantees on the Lipschitz constant, it is easy to implement and efficient to run, and it can be combined with any training objective and optimization method. In fact, our technique is the first one in the literature that achieves all of these properties simultaneously. Our main contribution is a rescaling-based weight matrix parametrization that guarantees that each network layer has a Lipschitz constant of at most 1 and results in learned weight matrices that are close to orthogonal. Hence, we call such layers almost-orthogonal Lipschitz (AOL). Experiments and ablation studies in the context of image classification with certified robust accuracy confirm that AOL layers achieve results that are on par with most existing methods. Yet they are simpler to implement and more broadly applicable, because they do not require computationally expensive matrix orthogonalization or inversion steps as part of the network architecture. We provide code at https://github.com/berndprach/AOL.
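To illustrate what a rescaling-based parametrization with a Lipschitz constant of at most 1 can look like, the following is a minimal NumPy sketch for a fully-connected layer. It assumes the rescaling takes the form W = P D, where P is an unconstrained parameter matrix and D is diagonal with entries d_i = (sum_j |P^T P|_ij)^(-1/2); this form is not stated in the abstract above. The function name `aol_rescale` and the `eps` parameter are hypothetical and are not taken from the linked repository.

```python
import numpy as np


def aol_rescale(P, eps=1e-12):
    """Rescale the columns of P so that the result has spectral norm <= 1.

    Assumed form (not taken from the abstract): W = P @ diag(d) with
    d_i = (sum_j |P^T P|_ij) ** -0.5.
    """
    # One bound per input dimension: row sums of |P^T P|.
    row_sums = np.abs(P.T @ P).sum(axis=1)
    # eps only guards against division by zero for all-zero columns.
    d = 1.0 / np.sqrt(np.maximum(row_sums, eps))
    # Broadcasting scales column j of P by d[j], i.e. P @ diag(d).
    return P * d


rng = np.random.default_rng(0)

# Unconstrained parameter matrix, e.g. for a 32 -> 64 linear layer.
P = rng.normal(size=(64, 32))
W = aol_rescale(P)
spectral_norm = np.linalg.norm(W, ord=2)  # largest singular value of W

# A matrix with orthonormal columns is left (numerically) unchanged.
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
Q_rescaled = aol_rescale(Q)
```

Under this assumption the bound follows because diag(sum_j |P^T P|_ij) - P^T P is symmetric and diagonally dominant with a nonnegative diagonal, hence positive semidefinite, so W^T W = D P^T P D is bounded above by the identity. No matrix orthogonalization or inversion is involved, only one matrix product and an elementwise rescaling.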