Novel Quadratic Constraints for Extending LipSDP beyond Slope-Restricted Activations

Recently, semidefinite programming (SDP) techniques have shown great promise in providing accurate Lipschitz bounds for neural networks. Specifically, the LipSDP approach (Fazlyab et al., 2019) has received much attention and provides the least conservative Lipschitz upper bounds that can be computed with polynomial time guarantees. However, one main restriction of LipSDP is that its formulation requires the activation functions to be slope-restricted on $[0,1]$, which prevents its use for more general activation functions such as GroupSort, MaxMin, and Householder. For example, one can rewrite MaxMin activations as residual ReLU networks. However, a direct application of LipSDP to the resulting residual ReLU networks is conservative and even fails to recover the well-known fact that the MaxMin activation is 1-Lipschitz. Our paper bridges this gap and extends LipSDP beyond slope-restricted activation functions. To this end, we provide novel quadratic constraints for GroupSort, MaxMin, and Householder activations by leveraging their underlying properties, such as sum preservation. Our proposed analysis is general and provides a unified approach for estimating $\ell_2$ and $\ell_\infty$ Lipschitz bounds for a rich class of neural network architectures, including non-residual and residual neural networks and implicit models, with GroupSort, MaxMin, and Householder activations. Finally, we illustrate the utility of our approach with a variety of experiments and show that our proposed SDPs generate less conservative Lipschitz bounds than existing approaches.
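The properties the abstract relies on can be checked numerically. The following is a minimal NumPy sketch, not the paper's code: it assumes the standard definitions of GroupSort (sort within consecutive groups) and MaxMin (GroupSort with group size 2), and the function names `group_sort`, `maxmin_via_relu`, and `empirical_lipschitz_ratio` are illustrative. It shows the residual ReLU rewrite of MaxMin via $\max(a,b) = b + \mathrm{ReLU}(a-b)$ and $\min(a,b) = a - \mathrm{ReLU}(a-b)$, sum preservation, and an empirical $\ell_2$ check that is consistent with (but does not prove) the 1-Lipschitz property.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def group_sort(x, group_size):
    """Sort entries in descending order within consecutive groups."""
    x = np.asarray(x, dtype=float)
    if x.size % group_size != 0:
        raise ValueError("input length must be a multiple of group_size")
    groups = x.reshape(-1, group_size)
    # np.sort is ascending, so negate twice to get descending order.
    return (-np.sort(-groups, axis=1)).reshape(-1)


def maxmin_via_relu(x):
    """MaxMin written as a residual ReLU map on consecutive pairs (a, b)."""
    pairs = np.asarray(x, dtype=float).reshape(-1, 2)
    a, b = pairs[:, 0], pairs[:, 1]
    r = relu(a - b)
    # max(a, b) = b + relu(a - b), min(a, b) = a - relu(a - b)
    return np.stack([b + r, a - r], axis=1).reshape(-1)


def empirical_lipschitz_ratio(f, dim, trials=1000, seed=0):
    """Largest observed ||f(x) - f(y)||_2 / ||x - y||_2 over random pairs.

    This is only a lower estimate of the Lipschitz constant, not a bound.
    """
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        x, y = rng.normal(size=dim), rng.normal(size=dim)
        ratio = np.linalg.norm(f(x) - f(y)) / np.linalg.norm(x - y)
        worst = max(worst, ratio)
    return worst


x = np.array([1.0, 3.0, -2.0, 0.5])
y = group_sort(x, 2)  # MaxMin is GroupSort with group size 2

same_as_relu_form = np.allclose(y, maxmin_via_relu(x))
sum_preserved = np.isclose(x.sum(), y.sum())
ratio = empirical_lipschitz_ratio(lambda v: group_sort(v, 2), dim=8)
print(same_as_relu_form, sum_preserved, ratio <= 1.0 + 1e-9)
```

The ratio check only samples random pairs, so it can suggest but never certify a Lipschitz bound; certified bounds are what the SDP-based analysis in the paper is for.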
