Neural networks have attracted considerable interest because of their effectiveness in many applications. However, their mathematical properties are generally not well understood. If there is some underlying geometric structure inherent to the data or to the function to be approximated, it is often desirable to take this into account in the design of the neural network. In this work, we start with a non-autonomous ODE and build neural networks using a suitable, structure-preserving numerical time-discretisation. The structure of the neural network is then inferred from the properties of the ODE vector field. Besides injecting more structure into the network architectures, this modelling procedure allows a better theoretical understanding of their behaviour. We present two universal approximation results and demonstrate how to impose some particular properties on the neural networks. We focus in particular on 1-Lipschitz architectures that include layers which are not themselves 1-Lipschitz. These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10 and CIFAR-100 datasets.
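To make the construction concrete, the following is a minimal sketch of how a layer can be read off from one explicit Euler step of an ODE. It is an illustration under stated assumptions, not the architecture used in this work: the vector field x' = -A(t)^T ReLU(A(t) x + b(t)), the step-size rule, and all function names (`euler_layer`, `make_network`, `forward`) are hypothetical choices. For this particular field each step is the gradient step of a convex function, so it is non-expansive (1-Lipschitz) in the Euclidean norm whenever h <= 2 / ||A||_2^2. The sketch uses the Frobenius norm, which bounds the spectral norm from above, to pick a safe step. It covers only the simple case in which every layer is itself 1-Lipschitz.

```python
import random


def relu(z):
    return [max(v, 0.0) for v in z]


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def matvec_T(A, z):
    """Compute A^T z without forming the transpose."""
    return [sum(A[i][j] * z[i] for i in range(len(A))) for j in range(len(A[0]))]


def euler_layer(x, A, b, h):
    """One explicit Euler step of x' = -A^T relu(A x + b) (assumed vector field)."""
    z = relu([s + bi for s, bi in zip(matvec(A, x), b)])
    g = matvec_T(A, z)
    return [xi - h * gi for xi, gi in zip(x, g)]


def make_network(n_layers, dim, hidden, seed=0):
    """Random layer-dependent weights (A_k, b_k): the non-autonomous part of the ODE."""
    rng = random.Random(seed)
    layers = []
    for _ in range(n_layers):
        A = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(hidden)]
        b = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        # ||A||_F^2 >= ||A||_2^2, so this step satisfies h <= 2 / ||A||_2^2.
        frob_sq = sum(a * a for row in A for a in row)
        layers.append((A, b, 2.0 / frob_sq))
    return layers


def forward(x, layers):
    """Compose the Euler steps; a composition of 1-Lipschitz maps is 1-Lipschitz."""
    for A, b, h in layers:
        x = euler_layer(x, A, b, h)
    return x


def dist(x, y):
    """Euclidean distance between two vectors."""
    return sum((a - c) ** 2 for a, c in zip(x, y)) ** 0.5
```

With these assumptions, `dist(forward(x, layers), forward(y, layers))` should never exceed `dist(x, y)` for any pair of inputs, which is the robustness property the 1-Lipschitz constraint is meant to provide.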