Neural Network Diffusion

From arXiv: We introduce a novel approach for parameter generation, named neural network diffusion (p-diff, where p stands for parameter), which employs a standard latent diffusion model to synthesize a new set of parameters.

Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a standard latent diffusion model. The autoencoder extracts latent representations of a subset of the trained network parameters. A diffusion model is then trained to synthesize these latent parameter representations from random noise. At generation time, the diffusion model produces new latent representations, which are passed through the autoencoder's decoder; the decoder's outputs are ready to use as new subsets of network parameters. Across various architectures and datasets, our diffusion process consistently generates models of comparable or improved performance over trained networks, with minimal additional cost. Notably, we empirically find that the generated models perform differently from the trained networks. Our results encourage further exploration of the versatile use of diffusion models.
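The data flow described in the abstract (encode trained parameters, learn to denoise their latents, sample from noise, decode) can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical. Two stand-ins keep it dependency-free: the "autoencoder" only centers the parameter vectors instead of being a learned network, and the "denoiser" is the closed-form noise predictor for a zero-mean Gaussian latent instead of a trained diffusion network. Only the sampling loop follows the standard DDPM ancestral sampling rule with a linear noise schedule.

```python
import math
import random


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule: returns betas and cumulative alpha products."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def fit_toy_autoencoder(param_vectors):
    """ASSUMPTION: stand-in for the learned autoencoder.
    The 'latent' is just the mean-centered parameter vector."""
    n, dim = len(param_vectors), len(param_vectors[0])
    mean = [sum(v[i] for v in param_vectors) / n for i in range(dim)]

    def encode(v):
        return [x - m for x, m in zip(v, mean)]

    def decode(z):
        return [x + m for x, m in zip(z, mean)]

    return encode, decode


def fit_toy_denoiser(latents, alpha_bars):
    """ASSUMPTION: stand-in for the trained noise-prediction network.
    Treats each latent dimension as a zero-mean Gaussian with variance v,
    for which E[eps | z_t] = sqrt(1 - a) * z_t / (a * v + 1 - a)."""
    n, dim = len(latents), len(latents[0])
    var = [sum(z[i] ** 2 for z in latents) / n for i in range(dim)]

    def predict_noise(z_t, t):
        a = alpha_bars[t]
        return [math.sqrt(1 - a) * x / (a * v + 1 - a) for x, v in zip(z_t, var)]

    return predict_noise


def sample_latent(predict_noise, betas, alpha_bars, dim, rng):
    """DDPM ancestral sampling: start from pure noise, denoise step by step."""
    z = [rng.gauss(0, 1) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        eps = predict_noise(z, t)
        coef = betas[t] / math.sqrt(1 - alpha_bars[t])
        z = [(x - coef * e) / math.sqrt(1 - betas[t]) for x, e in zip(z, eps)]
        if t > 0:  # no noise is added on the final step
            z = [x + math.sqrt(betas[t]) * rng.gauss(0, 1) for x in z]
    return z


def generate_parameters(trained_vectors, num_new, seed=0):
    """Encode trained parameter subsets, fit the toy denoiser, sample, decode."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule()
    encode, decode = fit_toy_autoencoder(trained_vectors)
    latents = [encode(v) for v in trained_vectors]
    predict_noise = fit_toy_denoiser(latents, alpha_bars)
    dim = len(latents[0])
    return [decode(sample_latent(predict_noise, betas, alpha_bars, dim, rng))
            for _ in range(num_new)]
```

Here `trained_vectors` would hold the flattened parameter subsets taken from several trained checkpoints, and each returned vector would be reshaped and loaded back into the corresponding layers. In a real setting both stand-ins would be replaced by trained networks; the sketch only shows how the pieces connect.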

