Learning in weight spaces, where neural networks process the weights of other deep neural networks, has emerged as a promising research direction with applications in various fields, from analyzing and editing neural fields and implicit neural representations to network pruning and quantization. Recent works have designed architectures for effective learning in that space that take into account its unique permutation-equivariant structure. Unfortunately, these architectures have so far suffered from severe overfitting and have been shown to benefit from large datasets. This poses a significant challenge because generating data for this learning setup is laborious and time-consuming: each data sample is a full set of network weights that has to be trained. In this paper, we address this difficulty by investigating data augmentations for weight spaces, a set of techniques that enable generating new data examples on the fly without having to train additional input weight space elements. We first review several recently proposed data augmentation schemes and divide them into categories. We then introduce a novel augmentation scheme based on the Mixup method. We evaluate the performance of these techniques on existing benchmarks as well as on new benchmarks that we generate, which can be valuable for future studies.
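As a rough illustration of what an on-the-fly weight-space augmentation can look like, the sketch below uses a two-layer ReLU MLP in NumPy. It shows two things: permuting the hidden neurons, which changes the weight vector but leaves the network's function unchanged, and a plain Mixup-style convex combination of two weight sets. All function names here (`mlp_forward`, `permute_hidden`, `naive_weight_mixup`) are hypothetical, and the interpolation is generic Mixup applied directly to weights; it is not claimed to be the scheme introduced in this paper.

```python
import numpy as np


def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer MLP with a ReLU hidden layer."""
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2


def permute_hidden(W1, b1, W2, b2, rng):
    """Reorder hidden neurons; the represented function is unchanged."""
    perm = rng.permutation(W1.shape[1])
    # Columns of W1 and entries of b1 produce the hidden units;
    # rows of W2 consume them, so all three use the same permutation.
    return W1[:, perm], b1[perm], W2[perm, :], b2


def naive_weight_mixup(weights_a, weights_b, lam):
    """Generic Mixup: convex combination of two weight sets (assumption:
    no alignment step; labels would be mixed with the same lam)."""
    return tuple(lam * a + (1.0 - lam) * b for a, b in zip(weights_a, weights_b))


rng = np.random.default_rng(0)
weights = (
    rng.normal(size=(4, 8)),  # W1: input -> hidden
    rng.normal(size=8),       # b1
    rng.normal(size=(8, 3)),  # W2: hidden -> output
    rng.normal(size=3),       # b2
)
x = rng.normal(size=(5, 4))

# Augmented sample: different weight vector, same function.
augmented = permute_hidden(*weights, rng)
same_function = np.allclose(mlp_forward(x, *weights), mlp_forward(x, *augmented))

# Mixup coefficient drawn from a Beta distribution, as in standard Mixup.
lam = rng.beta(1.0, 1.0)
mixed = naive_weight_mixup(weights, augmented, lam)
```

Because the permuted copy encodes the same function, it can be added to a training set without training a new network. Interpolating two unaligned weight sets, by contrast, generally does not preserve either function, which is one reason a Mixup variant for weight spaces needs more care than this naive version.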