Input-Convex Neural Networks (ICNNs) are networks that guarantee convexity in their input-output mapping. These networks have been successfully applied to energy-based modelling, optimal transport problems and learning invariances. The convexity of ICNNs is achieved by using non-decreasing convex activation functions and non-negative weights. Because of these peculiarities, previous initialisation strategies, which implicitly assume centred weights, are not effective for ICNNs. By studying signal propagation through layers with non-negative weights, we derive a principled weight initialisation for ICNNs. Concretely, we generalise signal propagation theory by removing the assumption that weights are sampled from a centred distribution. In a set of experiments, we demonstrate that our principled initialisation accelerates learning in ICNNs and leads to better generalisation. Moreover, we find that, contrary to common belief, ICNNs can be trained without skip-connections when initialised correctly. Finally, we apply ICNNs to a real-world drug discovery task and show that they allow for more effective molecular latent space exploration.
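The construction described above (non-decreasing convex activations combined with non-negative weights) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the helper names `init_params` and `icnn_forward`, the layer sizes, the choice of ReLU, and in particular the naive "absolute value of a Gaussian" weights, which are not the principled initialisation derived in this work. The sketch also assumes that only the layers acting on hidden activations need non-negative weights, since the first layer is an affine map of the input.

```python
import numpy as np


def init_params(sizes, rng):
    """Hypothetical helper: random parameters for a small ICNN without skip-connections."""
    params = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        w = rng.normal(size=(n_in, n_out)) / np.sqrt(n_in)
        if i > 0:
            # Layers acting on hidden activations get non-negative weights.
            # Taking abs() is a naive placeholder, not the paper's initialisation.
            w = np.abs(w)
        b = rng.normal(size=n_out)
        params.append((w, b))
    return params


def icnn_forward(params, x):
    """Forward pass: affine maps with ReLU (convex, non-decreasing) in between."""
    z = x
    for i, (w, b) in enumerate(params):
        z = z @ w + b
        if i < len(params) - 1:
            z = np.maximum(z, 0.0)  # no activation after the output layer
    return z


rng = np.random.default_rng(0)
params = init_params([4, 16, 16, 1], rng)

# Numerical convexity check: f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y)
x = rng.normal(size=(1000, 4))
y = rng.normal(size=(1000, 4))
t = rng.uniform(size=(1000, 1))
lhs = icnn_forward(params, t * x + (1 - t) * y)
rhs = t * icnn_forward(params, x) + (1 - t) * icnn_forward(params, y)
max_violation = float(np.max(lhs - rhs))
print("largest convexity violation:", max_violation)
```

Up to floating-point error, the largest violation should not be positive, which is consistent with the output being convex in the input. The check samples random segments only, so it is a sanity test rather than a proof.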