We propose an approach to neural network weight encoding for generalization performance prediction that uses set-to-set and set-to-vector functions to efficiently encode neural network parameters. Unlike previous approaches, which require a custom encoding model for each architecture, our approach can encode the neural networks in a modelzoo of mixed architectures and varying parameter sizes. Furthermore, our \textbf{S}et-based \textbf{N}eural network \textbf{E}ncoder (SNE) accounts for the hierarchical computational structure of neural networks through a layer-wise encoding scheme: each layer is encoded separately, and the layer-wise encodings are then encoded together to obtain the neural network encoding vector. Additionally, we introduce a \textit{pad-chunk-encode} pipeline for efficiently encoding neural network layers that can be adjusted to computational and memory constraints. We also introduce two new tasks for neural network generalization performance prediction: cross-dataset and cross-architecture. In cross-dataset performance prediction, we evaluate how well performance predictors generalize across modelzoos of the same architecture trained on different datasets. In cross-architecture performance prediction, we evaluate how well generalization performance predictors transfer to modelzoos of different architectures. Experimentally, we show that SNE outperforms the relevant baselines on the cross-dataset task, and we provide the first set of results on the cross-architecture task.
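To make the \textit{pad-chunk-encode} idea concrete, the following is a minimal sketch and not the SNE implementation. It assumes that each layer's weights are flattened, zero-padded to a multiple of a chunk size, and split into fixed-size chunks, and that a simple permutation-invariant pooling (a fixed random projection followed by a mean) stands in for the learned set-to-vector functions. The names `pad_and_chunk`, `set_to_vector`, and `encode_network`, as well as `chunk_size` and `dim`, are hypothetical. The sketch also ignores layer order, which the actual encoder may treat differently.

```python
import numpy as np

def pad_and_chunk(weights, chunk_size):
    """Flatten a layer, zero-pad to a multiple of chunk_size, split into chunks."""
    flat = np.asarray(weights, dtype=np.float64).ravel()
    pad = (-flat.size) % chunk_size           # zeros needed to fill the last chunk
    padded = np.pad(flat, (0, pad))
    return padded.reshape(-1, chunk_size)     # shape: (num_chunks, chunk_size)

def set_to_vector(elements, proj):
    """Stand-in set-to-vector function: embed each element, then mean-pool.

    Mean pooling makes the output independent of the order of the rows.
    """
    return np.tanh(elements @ proj).mean(axis=0)

def encode_network(layers, chunk_size=8, dim=4, seed=0):
    """Encode each layer from its chunks, then pool layer codes into one vector."""
    rng = np.random.default_rng(seed)
    # Assumption: fixed random projections replace the learned encoder weights.
    chunk_proj = rng.standard_normal((chunk_size, dim))
    layer_proj = rng.standard_normal((dim, dim))
    layer_codes = np.stack(
        [set_to_vector(pad_and_chunk(w, chunk_size), chunk_proj) for w in layers]
    )
    return set_to_vector(layer_codes, layer_proj)

# Two toy "networks" with different architectures and parameter counts.
rng = np.random.default_rng(1)
mlp = [rng.standard_normal((5, 3)), rng.standard_normal((3, 2))]
cnn = [rng.standard_normal((4, 3, 3, 3)), rng.standard_normal((10, 7)),
       rng.standard_normal(10)]
print(encode_network(mlp).shape, encode_network(cnn).shape)  # (4,) (4,)
```

Both toy networks map to a vector of the same length regardless of their layer shapes, which is the property that lets a single encoder serve a modelzoo of mixed architectures. In this sketch, a smaller `chunk_size` yields more, shorter chunks per layer, which is one way the trade-off between computation and memory could be tuned.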