A Unified Algebraic Perspective on Lipschitz Neural Networks

Substantial research effort has focused on the design and training of neural networks with a controlled Lipschitz constant. The goal is to improve, and sometimes guarantee, robustness against adversarial attacks. Recent promising techniques draw inspiration from different backgrounds to design 1-Lipschitz neural networks. To name just two: convex potential layers derive from the discretization of continuous dynamical systems, and the Almost-Orthogonal-Layer (AOL) proposes a tailored method for matrix rescaling. However, it is now important to consider these recent contributions under a common theoretical lens in order to design new and improved layers. This paper introduces a novel algebraic perspective unifying various types of 1-Lipschitz neural networks, including the ones previously mentioned, along with methods based on orthogonality and spectral methods. Interestingly, we show that many existing techniques can be derived and generalized by finding analytical solutions of a common semidefinite programming (SDP) condition. We also prove that AOL biases the scaled weights toward matrices that are close to the set of orthogonal matrices in a certain mathematical sense. Moreover, our algebraic condition, combined with the Gershgorin circle theorem, readily leads to new and diverse parameterizations for 1-Lipschitz network layers. Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalizations of convex potential layers. Finally, a comprehensive set of experiments on image classification shows that SLLs outperform previous approaches on certified robust accuracy. Code is available at https://github.com/araujoalexandre/Lipschitz-SLL-Networks.
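To make the role of the Gershgorin circle theorem concrete, the following is a minimal NumPy sketch, not the released implementation. It assumes the standard AOL rescaling W D with D_ii = (sum_j |W^T W|_ij)^(-1/2), and a residual layer of the form x - 2 W T^(-1) ReLU(W^T x + b) with diagonal T_ii = sum_j |W^T W|_ij q_j / q_i for a positive vector q. In both cases the diagonal matrix is chosen so that Gershgorin's theorem gives W^T W ⪯ T. The function names, the `eps` safeguard, the choice of ReLU, and the dimensions are illustrative assumptions; the exact parameterization should be checked against the paper and the linked repository.

```python
import numpy as np


def aol_rescale(W, eps=1e-12):
    """Rescale the columns of W so that the result has spectral norm <= 1.

    Column i is multiplied by d_i = (sum_j |W^T W|_ij)^(-1/2).
    eps is an illustrative safeguard against all-zero columns.
    """
    gram_abs = np.abs(W.T @ W)          # (m, m) absolute Gram matrix
    row_sums = gram_abs.sum(axis=1)     # Gershgorin-style row sums
    d = 1.0 / np.sqrt(row_sums + eps)
    return W * d                        # broadcasting scales column i by d_i


def sll_layer(x, W, b, q, eps=1e-12):
    """Residual layer x - 2 W T^{-1} relu(W^T x + b) with diagonal T.

    T_ii = sum_j |W^T W|_ij * q_j / q_i, where q has positive entries.
    Setting q to all ones ties T to the same row sums used by aol_rescale.
    """
    gram_abs = np.abs(W.T @ W)                      # (m, m)
    t = (gram_abs * q[None, :]).sum(axis=1) / q     # diagonal of T
    hidden = np.maximum(W.T @ x + b, 0.0)           # ReLU activation
    return x - 2.0 * W @ (hidden / (t + eps))


# Usage: empirical check on random data (dimensions are arbitrary).
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 4))     # input dimension 6, hidden width 4
b = rng.normal(size=4)
q = rng.uniform(0.5, 2.0, size=4)

print("spectral norm after AOL rescaling:", np.linalg.norm(aol_rescale(W), 2))

x, y = rng.normal(size=6), rng.normal(size=6)
ratio = np.linalg.norm(sll_layer(x, W, b, q) - sll_layer(y, W, b, q)) / np.linalg.norm(x - y)
print("empirical Lipschitz ratio of the residual layer:", ratio)
```

Both printed values should not exceed 1 up to floating-point error. The vector q is a free parameter of the sketch: different positive choices give different diagonal matrices T, all of which satisfy the same Gershgorin bound.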
