We propose a neural network weight encoding method for network property prediction that uses set-to-set and set-to-vector functions to efficiently encode neural network parameters. Unlike previous approaches, which require a custom encoding model for each architecture, our approach can encode neural networks from a model zoo of mixed architectures and parameter sizes. Furthermore, our \textbf{S}et-based \textbf{N}eural network \textbf{E}ncoder (SNE) takes into account the hierarchical computational structure of neural networks. To respect the symmetries inherent in network weight space, we use Logit Invariance to learn the minimal invariance properties required. Additionally, we introduce a \textit{pad-chunk-encode} pipeline that encodes neural network layers efficiently and can be adjusted to computational and memory constraints. We also introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture. In cross-dataset property prediction, we evaluate how well property predictors generalize across model zoos trained on different datasets but sharing the same architecture. In cross-architecture property prediction, we evaluate how well property predictors transfer to model zoos of architectures not seen during training. We show that SNE outperforms the relevant baselines on standard benchmarks.
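The weight-space symmetry motivating the invariance learning is the classical hidden-neuron permutation symmetry: permuting the hidden units of a layer, together with the matching rows and columns of the adjacent weight matrices, changes the parameters but not the function the network computes. The following minimal PyTorch sketch illustrates this fact (illustrative only, not code from the paper):

\begin{verbatim}
import torch

# A two-layer MLP: permuting the hidden units (rows of W1 and b1,
# columns of W2) yields different weights but the same function.
def mlp(W1, b1, W2, b2, x):
    return torch.relu(x @ W1.T + b1) @ W2.T + b2

torch.manual_seed(0)
W1, b1 = torch.randn(5, 3), torch.randn(5)
W2, b2 = torch.randn(2, 5), torch.randn(2)
x = torch.randn(4, 3)

perm = torch.randperm(5)
same = torch.allclose(mlp(W1, b1, W2, b2, x),
                      mlp(W1[perm], b1[perm], W2[:, perm], b2, x))
print(same)  # True
\end{verbatim}

An encoder that assigns different codes to these two parameterizations wastes capacity on a distinction the underlying function does not make, which is why a weight encoder should be (at least approximately) invariant to such permutations.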
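The \textit{pad-chunk-encode} pipeline can be sketched in a few lines. The sketch below assumes PyTorch; the names \texttt{pad\_chunk\_encode} and \texttt{chunk\_size} and the stand-in linear chunk encoder are hypothetical choices for illustration, with \texttt{chunk\_size} exposing the compute/memory trade-off:

\begin{verbatim}
import torch

def pad_chunk_encode(weights, chunk_size, encoder):
    # Zero-pad the flattened parameters to a multiple of chunk_size,
    # reshape into equal-sized chunks, and encode each chunk.
    flat = weights.flatten()
    pad_len = (-flat.numel()) % chunk_size
    flat = torch.cat([flat, flat.new_zeros(pad_len)])
    chunks = flat.view(-1, chunk_size)   # (num_chunks, chunk_size)
    return encoder(chunks)               # one embedding per chunk

# Toy usage: a linear map stands in for SNE's chunk encoder.
layer = torch.nn.Linear(10, 7)           # 77 parameters in total
params = torch.cat([p.flatten() for p in layer.parameters()])
encoder = torch.nn.Linear(16, 32)
print(pad_chunk_encode(params, 16, encoder).shape)  # torch.Size([5, 32])
\end{verbatim}

Chunk embeddings of this form can then be aggregated by set-to-vector functions into a single layer- or network-level code, independent of the original parameter count.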