Convex functions and their gradients play a critical role in mathematical imaging, from proximal optimization to Optimal Transport. The success of deep learning has led many to use learning-based methods, where fixed functions or operators are replaced by learned neural networks. Despite their empirical superiority, establishing rigorous guarantees for these methods often requires imposing structural constraints on neural architectures, in particular convexity. The most popular way to do so is to use so-called Input Convex Neural Networks (ICNNs). To explore the expressivity of ICNNs, we provide necessary and sufficient conditions for a ReLU neural network to be convex. These characterizations are based on products of weights and activations, and take a simple form for any architecture in the path-lifting framework. As particular applications, we study our characterizations in depth for 1- and 2-hidden-layer neural networks: we show that every convex function implemented by a 1-hidden-layer ReLU network can also be expressed by an ICNN with the same architecture; however, this property no longer holds with more layers. Finally, we provide a numerical procedure that allows an exact check of convexity for ReLU neural networks with a large number of affine regions.
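As an illustration of what an exact convexity check can look like in the simplest setting, the following is a minimal sketch for a scalar-output, 1-hidden-layer ReLU network $f(x) = \sum_j v_j\,\mathrm{ReLU}(w_j^\top x + b_j) + a^\top x + c$. It is not the procedure or the characterization developed in the paper; it only uses the elementary fact that a continuous piecewise-linear function is convex iff the gradient jump across every breakpoint hyperplane is nonnegative in the crossing direction, which here reduces to checking $\sum_j v_j \|w_j\| \ge 0$ over neurons sharing the same hyperplane. The function name `is_convex_one_hidden_layer` is hypothetical.

```python
# Illustrative sketch (not the paper's procedure): exact convexity test for a
# scalar-output, 1-hidden-layer ReLU network
#     f(x) = sum_j v[j] * relu(W[j] @ x + b[j]) + a @ x + c.
# f is piecewise linear, so it is convex iff the gradient jump across every
# hyperplane {W[j] @ x + b[j] = 0} is nonnegative in the crossing direction;
# grouping neurons that share the same hyperplane, this reduces to
# sum_j v[j] * ||W[j]|| >= 0 within each group.
import numpy as np


def is_convex_one_hidden_layer(W, b, v, tol=1e-9):
    """W: (m, d) hidden weights, b: (m,) hidden biases, v: (m,) output weights."""
    groups = {}
    for wj, bj, vj in zip(W, b, v):
        nrm = np.linalg.norm(wj)
        if nrm < tol:          # dead/constant neuron: affine contribution only
            continue
        u, beta = wj / nrm, bj / nrm
        # Canonical orientation of the hyperplane {u . x + beta = 0}:
        # flip the sign so the first non-negligible coordinate of u is positive.
        k = np.argmax(np.abs(u) > tol)
        if u[k] < 0:
            u, beta = -u, -beta
        key = tuple(np.round(np.append(u, beta), 8))
        groups[key] = groups.get(key, 0.0) + vj * nrm
    # Convex iff every grouped gradient jump is nonnegative.
    return all(s >= -tol for s in groups.values())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(5, 3)), rng.normal(size=5)
    print(is_convex_one_hidden_layer(W, b, np.abs(rng.normal(size=5))))  # True: all v_j >= 0
    print(is_convex_one_hidden_layer(W, b, -np.ones(5)))                 # False
```

Brute-force enumeration of affine regions grows exponentially with depth and width, which is why a direct test of this kind does not scale beyond the shallow case and a dedicated procedure, as developed in the paper, is needed for networks with a large number of affine regions.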