Generative models have achieved remarkable success in the image, video, and text domains. Inspired by this, researchers have explored using generative models to synthesize neural network parameters. However, these efforts have been constrained by the scale of the parameters they can handle and by the practicality of generating high-performance parameters. In this paper, we propose COND P-DIFF, a novel approach that demonstrates the feasibility of controllable, high-performance parameter generation, particularly for LoRA (Low-Rank Adaptation) weights, during the fine-tuning process. Specifically, we employ an autoencoder to extract efficient latent representations of the parameters, then train a conditional latent diffusion model to synthesize high-performing model parameters from random noise given a task-specific condition. Experimental results in both computer vision and natural language processing consistently demonstrate that COND P-DIFF generates high-performance parameters conditioned on the given task. Moreover, we observe that the parameter distribution generated by COND P-DIFF differs from the distribution obtained through conventional optimization, indicating a degree of generalization capability. Our work paves the way for further exploration of condition-driven parameter generation, offering a promising direction for task-specific adaptation of neural networks.
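To make the described pipeline concrete, below is a minimal PyTorch sketch of the two-stage approach the abstract outlines: an autoencoder compresses flattened LoRA weights into a compact latent code, and a conditional denoising diffusion model is trained in that latent space to predict noise given a task-condition embedding. All module names, dimensions, and the DDPM-style noise schedule here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the paper):
PARAM_DIM = 8192   # size of a flattened LoRA update (A and B concatenated)
LATENT_DIM = 256   # autoencoder latent size
COND_DIM = 64      # task-condition embedding size
T = 1000           # number of diffusion timesteps


class ParamAutoencoder(nn.Module):
    """Compresses flattened LoRA parameters into a compact latent code."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(PARAM_DIM, 1024), nn.SiLU(),
                                 nn.Linear(1024, LATENT_DIM))
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM, 1024), nn.SiLU(),
                                 nn.Linear(1024, PARAM_DIM))

    def forward(self, w):
        z = self.enc(w)
        return self.dec(z), z


class CondDenoiser(nn.Module):
    """Predicts the noise added to a latent, given timestep and task condition."""
    def __init__(self):
        super().__init__()
        self.t_embed = nn.Embedding(T, LATENT_DIM)
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM * 2 + COND_DIM, 512), nn.SiLU(),
            nn.Linear(512, 512), nn.SiLU(),
            nn.Linear(512, LATENT_DIM))

    def forward(self, z_t, t, cond):
        return self.net(torch.cat([z_t, self.t_embed(t), cond], dim=-1))


# Standard DDPM linear noise schedule (an assumption for this sketch).
betas = torch.linspace(1e-4, 2e-2, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)


def diffusion_loss(denoiser, z0, cond):
    """One training step: noise a clean latent, ask the model to recover the noise."""
    t = torch.randint(0, T, (z0.size(0),))
    eps = torch.randn_like(z0)
    ab = alphas_bar[t].unsqueeze(-1)
    z_t = ab.sqrt() * z0 + (1 - ab).sqrt() * eps
    return nn.functional.mse_loss(denoiser(z_t, t, cond), eps)


@torch.no_grad()
def sample(denoiser, decoder, cond):
    """Ancestral DDPM sampling from pure noise, then decode to LoRA parameters."""
    z = torch.randn(cond.size(0), LATENT_DIM)
    for i in reversed(range(T)):
        t = torch.full((cond.size(0),), i, dtype=torch.long)
        eps = denoiser(z, t, cond)
        z = (z - betas[i] / (1 - alphas_bar[i]).sqrt() * eps) / (1.0 - betas[i]).sqrt()
        if i > 0:
            z = z + betas[i].sqrt() * torch.randn_like(z)
    return decoder(z)
```

Under these assumptions, training would proceed in two stages: first fit the autoencoder by reconstruction on LoRA checkpoints collected during fine-tuning, then freeze it and train the denoiser on the encoded latents; at inference, `sample(...)` yields task-conditioned LoRA weights directly from noise.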