This work studies the design of neural networks that can process the weights or gradients of other neural networks, which we refer to as neural functional networks (NFNs). Despite a wide range of potential applications, including learned optimization, processing implicit neural representations, network editing, and policy evaluation, there are few unifying principles for designing effective architectures that process the weights of other networks. We approach the design of neural functionals through the lens of symmetry, in particular by focusing on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order. We introduce a framework for building permutation equivariant neural functionals, whose architectures encode these symmetries as an inductive bias. The key building blocks of this framework are NF-Layers (neural functional layers) that we constrain to be permutation equivariant through an appropriate parameter sharing scheme. In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks that require processing the weights of MLPs and CNNs, such as predicting classifier generalization, producing "winning ticket" sparsity masks for initializations, and editing the weights of implicit neural representations (INRs). In addition, we provide code for our models and experiments at https://github.com/AllanYangZhou/nfn.
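The permutation symmetry described above can be checked numerically. The following is a minimal sketch and is not taken from the authors' repository: the helper names `mlp_forward` and `permute_hidden`, the tanh nonlinearity, and the layer sizes are illustrative assumptions. It reorders the hidden neurons of a two-layer MLP (rows of the first weight matrix, entries of the first bias, columns of the second weight matrix) and confirms that the network output is unchanged up to floating-point rounding.

```python
import math
import random


def mlp_forward(W1, b1, W2, b2, x):
    """Two-layer MLP on plain lists: y = W2 * tanh(W1 * x + b1) + b2."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(W2, b2)
    ]


def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden neurons: rows of W1, entries of b1, columns of W2."""
    W1_p = [W1[j] for j in perm]
    b1_p = [b1[j] for j in perm]
    W2_p = [[row[j] for j in perm] for row in W2]
    return W1_p, b1_p, W2_p


# Hypothetical sizes and random weights, chosen only for illustration.
rng = random.Random(0)
n_in, n_hidden, n_out = 3, 5, 2
W1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
W2 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.gauss(0, 1) for _ in range(n_out)]
x = [rng.gauss(0, 1) for _ in range(n_in)]

# Draw a random ordering of the hidden neurons and apply it to the weights.
perm = list(range(n_hidden))
rng.shuffle(perm)
W1_p, b1_p, W2_p = permute_hidden(W1, b1, W2, perm)

# The original and permuted weights define the same function of x.
y = mlp_forward(W1, b1, W2, b2, x)
y_p = mlp_forward(W1_p, b1_p, W2_p, b2, x)
max_diff = max(abs(a - b) for a, b in zip(y, y_p))
print("max output difference:", max_diff)
```

Because the two weight settings define the same function, a neural functional that predicts a property of the network, such as its generalization, should give the same answer for both. One that outputs per-weight quantities, such as a sparsity mask, should permute its output in the same way. This is the invariance or equivariance that the parameter sharing in NF-Layers is meant to encode.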