Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also \textit{generate high-performing neural network parameters}. Our approach is simple, combining an autoencoder with a standard latent diffusion model. The autoencoder extracts latent representations of a subset of the trained network parameters. A diffusion model is then trained to synthesize these latent parameter representations from random noise. Once trained, the diffusion model generates new latent representations, which are passed through the autoencoder's decoder; the decoder's outputs are ready to use as new subsets of network parameters. Across various architectures and datasets, our diffusion process consistently generates models whose performance is comparable to or better than that of the trained networks, at minimal additional cost. Notably, we find empirically that the generated models perform differently from the trained networks. Our results encourage further exploration of the versatile uses of diffusion models.
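The three stages described above (encode trained parameters, run a diffusion process in latent space, decode the samples back into parameters) can be illustrated with a toy sketch. This is not the implementation used in this work, and every name in it is hypothetical. Two stand-ins are assumed to keep it dependency-free: per-dimension standardization replaces the learned autoencoder, and a closed-form noise predictor, which is optimal only if each latent dimension is Gaussian, replaces the trained denoising network. The linear noise schedule and ancestral sampling step follow the standard DDPM formulation.

```python
import math
import random


def make_schedule(steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule; returns (betas, alphas, alpha_bars)."""
    betas = [beta_start + (beta_end - beta_start) * t / (steps - 1)
             for t in range(steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)
    return betas, alphas, alpha_bars


def fit_codec(checkpoints):
    """ASSUMPTION: standardization stands in for the learned autoencoder."""
    n, dim = len(checkpoints), len(checkpoints[0])
    mean = [sum(c[i] for c in checkpoints) / n for i in range(dim)]
    std = [max(1e-8, math.sqrt(sum((c[i] - mean[i]) ** 2
                                   for c in checkpoints) / n))
           for i in range(dim)]

    def encode(w):
        return [(w[i] - mean[i]) / std[i] for i in range(dim)]

    def decode(z):
        return [z[i] * std[i] + mean[i] for i in range(dim)]

    return encode, decode


def fit_gaussian_denoiser(latents, alpha_bars):
    """ASSUMPTION: closed-form E[noise | x_t] for Gaussian latents,
    used here in place of a trained denoising network."""
    n, dim = len(latents), len(latents[0])
    m = [sum(z[i] for z in latents) / n for i in range(dim)]
    v = [sum((z[i] - m[i]) ** 2 for z in latents) / n for i in range(dim)]

    def predict_noise(x_t, t):
        ab = alpha_bars[t]
        return [math.sqrt(1 - ab) * (x_t[i] - math.sqrt(ab) * m[i])
                / (ab * v[i] + 1 - ab) for i in range(dim)]

    return predict_noise


def sample_latent(predict_noise, dim, schedule, rng):
    """DDPM ancestral sampling: start from noise, denoise step by step."""
    betas, alphas, alpha_bars = schedule
    x = [rng.gauss(0, 1) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        eps = predict_noise(x, t)
        coef = betas[t] / math.sqrt(1 - alpha_bars[t])
        x = [(x[i] - coef * eps[i]) / math.sqrt(alphas[t])
             for i in range(dim)]
        if t > 0:  # no noise is added at the final step
            sigma = math.sqrt(betas[t])
            x = [xi + sigma * rng.gauss(0, 1) for xi in x]
    return x


def generate_parameters(checkpoints, num_samples=4, seed=0):
    """Encode checkpoints, fit the latent denoiser, sample, and decode."""
    rng = random.Random(seed)
    schedule = make_schedule()
    encode, decode = fit_codec(checkpoints)
    latents = [encode(c) for c in checkpoints]
    predict_noise = fit_gaussian_denoiser(latents, schedule[2])
    dim = len(checkpoints[0])
    return [decode(sample_latent(predict_noise, dim, schedule, rng))
            for _ in range(num_samples)]
```

Here `checkpoints` is assumed to be a list of flattened parameter subsets (lists of floats of equal length), for example the same layer saved at several points during training. Calling `generate_parameters(checkpoints, num_samples=4)` returns four new parameter vectors drawn near the distribution of the saved ones. In a realistic setting the two stand-ins would be replaced by a trained autoencoder and a trained denoising network.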