This work studies the design of neural networks that can process the weights or gradients of other neural networks, which we refer to as neural functional networks (NFNs). Despite a wide range of potential applications, including learned optimization, processing implicit neural representations, network editing, and policy evaluation, there are few unifying principles for designing effective architectures that process the weights of other networks. We approach the design of neural functionals through the lens of symmetry, in particular by focusing on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order. We introduce a framework for building permutation equivariant neural functionals, whose architectures encode these symmetries as an inductive bias. The key building blocks of this framework are NF-Layers (neural functional layers) that we constrain to be permutation equivariant through an appropriate parameter sharing scheme. In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks that require processing the weights of MLPs and CNNs, such as predicting classifier generalization, producing "winning ticket" sparsity masks for initializations, and classifying or editing implicit neural representations (INRs). In addition, we provide code for our models and experiments at https://github.com/AllanYangZhou/nfn.
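The permutation symmetry mentioned above can be checked numerically on a toy network. The following is a minimal sketch, not code from the linked repository: the helper names `mlp_forward` and `permute_hidden`, the layer sizes, the `tanh` nonlinearity, and the specific permutation are all assumptions chosen for illustration. It reorders the hidden neurons of a two-layer MLP (rows of the first weight matrix and bias, columns of the second weight matrix) and confirms that the network output is unchanged up to floating-point rounding.

```python
import math
import random


def mlp_forward(x, w1, b1, w2, b2):
    """Two-layer MLP, y = W2 * tanh(W1 * x + b1) + b2, using plain lists."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w2, b2)
    ]


def permute_hidden(w1, b1, w2, perm):
    """Reorder hidden neurons: rows of W1 and b1, and columns of W2."""
    w1_p = [w1[i] for i in perm]
    b1_p = [b1[i] for i in perm]
    w2_p = [[row[i] for i in perm] for row in w2]
    return w1_p, b1_p, w2_p


# Hypothetical sizes and random weights, for illustration only.
rng = random.Random(0)
n_in, n_hidden, n_out = 3, 5, 2
w1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
w2 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.gauss(0, 1) for _ in range(n_out)]
x = [rng.gauss(0, 1) for _ in range(n_in)]

# An arbitrary non-identity ordering of the five hidden neurons.
perm = [2, 0, 4, 1, 3]
w1_p, b1_p, w2_p = permute_hidden(w1, b1, w2, perm)

y = mlp_forward(x, w1, b1, w2, b2)
y_p = mlp_forward(x, w1_p, b1_p, w2_p, b2)

# The weights differ, but the function they compute does not.
max_diff = max(abs(a - b) for a, b in zip(y, y_p))
print("weights changed:", w1_p != w1)
print("max output difference:", max_diff)
```

Because both weight settings represent the same function, a network that takes such weights as input can reasonably be designed to respect this reordering, which is the inductive bias that a permutation equivariant architecture encodes.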