Neural networks are notoriously vulnerable to adversarial attacks: small, imperceptible perturbations that can drastically change the network's output. In the reverse direction, there may exist large, meaningful perturbations that leave the network's decision unchanged (excessive invariance, noninvertibility). We study the latter phenomenon in two contexts: (a) discrete-time dynamical system identification, and (b) calibration of the output of one neural network to the output of another (neural network matching). For ReLU networks and $L_p$ norms ($p=1,2,\infty$), we formulate the resulting optimization problems as mixed-integer programs (MIPs) that apply to neural network approximators of dynamical systems. We also discuss the applicability of our results to invertibility certification in transformations between neural networks (e.g., at different levels of pruning).
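To make the MIP formulation mentioned above more concrete, the following is a minimal sketch of the standard big-M encoding of a single ReLU unit, which is one common way to express ReLU networks with mixed-integer constraints. It is not taken from the paper: the symbols $x$, $y$, $z$, $l$, $u$, the network map $N$, and the domain $\mathcal{X}$ are assumptions introduced here for illustration, and the encoding presumes that pre-activation bounds $l < 0 < u$ are known.

$$
y = \max(0, x), \quad l \le x \le u
\quad \Longleftrightarrow \quad
\begin{cases}
y \ge 0, \quad y \ge x, \\
y \le u\, z, \\
y \le x - l\,(1 - z), \\
z \in \{0, 1\}.
\end{cases}
$$

With $z = 1$ the constraints force $y = x \ge 0$ (active unit), and with $z = 0$ they force $y = 0$ and $x \le 0$ (inactive unit). Under these assumptions, one plausible way to pose a noninvertibility question is to search for two inputs that are far apart yet mapped to the same output:

$$
\max_{x_1, x_2 \in \mathcal{X}} \; \lVert x_1 - x_2 \rVert_p
\quad \text{subject to} \quad N(x_1) = N(x_2),
$$

where every ReLU in both copies of $N$ is replaced by the constraints above. A strictly positive optimal value would indicate that $N$ is not invertible on $\mathcal{X}$.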