We investigate the expressivity and learning dynamics of bias-free ReLU networks. We first show that two-layer bias-free ReLU networks have limited expressivity: the only odd functions they can express are linear. We then show that, under symmetry conditions on the data, these networks have the same learning dynamics as linear networks. This allows us to give closed-form time-course solutions for certain two-layer bias-free ReLU networks, something that has not been done for nonlinear networks outside the lazy learning regime. While deep bias-free ReLU networks are more expressive than their two-layer counterparts, they still share a number of similarities with deep linear networks. These similarities enable us to leverage insights from linear networks, leading to a novel understanding of bias-free ReLU networks. Overall, our results show that some properties established for bias-free ReLU networks arise from their equivalence to linear networks, and suggest that including bias or considering asymmetric data are avenues to engage with nonlinear behaviors.
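The expressivity claim for two-layer networks can be checked numerically. The sketch below is an illustration, not the paper's own code: it relies on the identity ReLU(z) - ReLU(-z) = z, which implies that the odd part of any two-layer bias-free ReLU network, (f(x) - f(-x)) / 2, is a linear map of x. Hence a network of this form that is odd must itself be linear. The names `W`, `a`, `relu_net`, `odd_part`, and the dimensions `d` and `h` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 5, 64                   # input dimension and hidden width (illustrative values)
W = rng.normal(size=(h, d))    # first-layer weights, no bias (hypothetical random network)
a = rng.normal(size=h)         # second-layer weights, no bias


def relu_net(x):
    """Two-layer bias-free ReLU network f(x) = a^T ReLU(W x)."""
    return a @ np.maximum(W @ x, 0.0)


def odd_part(x):
    """Odd part of the network: (f(x) - f(-x)) / 2."""
    return 0.5 * (relu_net(x) - relu_net(-x))


# Because ReLU(z) - ReLU(-z) = z, the odd part reduces to the linear map (a^T W / 2) x.
linear_map = 0.5 * (a @ W)

# Compare the odd part with the linear map on random inputs.
xs = rng.normal(size=(100, d))
max_gap = max(abs(odd_part(x) - linear_map @ x) for x in xs)
print("max |odd part - linear map| over samples:", max_gap)
```

The gap should vanish up to floating-point error for any choice of `W` and `a`, since the identity holds for each hidden unit separately. The check covers only the two-layer expressivity statement; it says nothing about the learning dynamics or about deep networks.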