Model merging combines fine-tuned models from multiple domains to improve a single model's proficiency across those domains. The principal challenge is resolving parameter conflicts. Much existing research addresses this issue during the merging stage, while the most recent work resolves it during the pruning stage. The DARE approach has shown promising results when applied to simple fine-tuned models. However, its efficacy tends to wane on complex fine-tuned models whose parameters deviate significantly from those of the baseline model. In this paper, we introduce a two-stage method termed Dynamic Pruning Partition Amplification (DPPA), devised to tackle the challenge of merging complex fine-tuned models. First, we introduce Dynamically Pruning (DP), an improved approach based on magnitude pruning that aims to enhance performance at higher pruning rates. Second, we propose Dynamically Partition Amplification (DPA), a rescaling strategy designed to dynamically amplify parameter partitions according to their significance levels. Experimental results show that our method retains only 20% of domain-specific parameters yet delivers performance comparable to other methods that preserve up to 90% of parameters. Furthermore, our method performs strongly after pruning, leading to an improvement of nearly 20% in model merging performance. We make our code available on GitHub.
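The two baseline operations that the abstract builds on, magnitude pruning and DARE-style random dropping with rescaling, can be sketched as follows. This is a minimal illustration of those baselines and not of DPPA itself: the function names, the NumPy representation of weights, and the `keep_ratio` and `drop_rate` parameters are assumptions made here for illustration, and both functions operate on the difference between fine-tuned and baseline weights.

```python
import numpy as np


def magnitude_prune(delta, keep_ratio):
    """Keep roughly the keep_ratio fraction of entries with the largest
    absolute value and zero the rest (ties at the threshold are all kept)."""
    flat = np.abs(delta).ravel()
    k = int(round(keep_ratio * flat.size))
    if k <= 0:
        return np.zeros_like(delta)
    if k >= flat.size:
        return delta.copy()
    # The k-th largest magnitude serves as the keep threshold.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    return delta * (np.abs(delta) >= threshold)


def dare_prune(delta, drop_rate, rng):
    """DARE-style pruning: drop each entry independently with probability
    drop_rate, then rescale survivors by 1 / (1 - drop_rate).
    Requires 0 <= drop_rate < 1."""
    keep_mask = rng.random(delta.shape) >= drop_rate
    return delta * keep_mask / (1.0 - drop_rate)


# Hypothetical toy weights standing in for one layer of a model.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 5))
finetuned = base + rng.normal(scale=0.1, size=(4, 5))

# Domain-specific parameters: the difference from the baseline model.
delta = finetuned - base

# Keep 20% of the domain-specific parameters, then add them back to the base.
pruned_delta = magnitude_prune(delta, keep_ratio=0.2)
restored = base + pruned_delta
```

Magnitude pruning is deterministic and keeps the largest differences unchanged, whereas the DARE-style variant keeps a random subset and rescales it so that the expected value of each entry is preserved. How DP and DPA modify these steps is not specified in the abstract, so the sketch leaves that out.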