Model merging combines multiple homologous models into a single model, achieving competitive generalization without additional training. A key challenge in this problem is resolving parameter redundancies and conflicts across the models. Existing methods have demonstrated that dropping a portion of delta parameters can alleviate conflicts while maintaining performance. However, these methods typically drop parameters either randomly or by magnitude, overlooking the task-specific information embedded in fine-tuned models. In this paper, we propose an Activated Parameter Locating (APL) method that utilizes causal intervention to estimate parameter importance, enabling more precise parameter drops and better conflict mitigation. Moreover, to reduce the computational cost associated with a large number of parameter partitions, we introduce a theoretically supported gradient approximation strategy for APL. Experiments on model merging in both in-domain and out-of-domain settings, along with associated analyses, demonstrate the effectiveness of APL.
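The core idea above, merging by dropping a fraction of delta parameters according to an importance score rather than randomly or by magnitude, can be illustrated with a minimal sketch. Note the importance score here is a hypothetical first-order proxy |g · Δ| with a placeholder gradient; the paper's APL method instead estimates importance via causal intervention, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a base model and one fine-tuned model (flattened parameters).
base = rng.normal(size=8)
finetuned = base + rng.normal(scale=0.1, size=8)
delta = finetuned - base          # "delta parameters" induced by fine-tuning

# Hypothetical importance scores: a first-order proxy |g * delta| with a
# made-up gradient g (APL's causal-intervention estimate is assumed away).
g = rng.normal(size=8)            # placeholder gradient of a task loss
importance = np.abs(g * delta)

# Drop the least-important fraction of delta parameters, keep the rest.
drop_ratio = 0.5
k = int(drop_ratio * delta.size)
drop_idx = np.argsort(importance)[:k]
mask = np.ones_like(delta)
mask[drop_idx] = 0.0

# Merged model retains only the deltas judged important.
merged = base + mask * delta
```

With several fine-tuned models, the same masking would be applied to each model's delta before combining them, which is where conflict mitigation comes in: fewer surviving deltas means fewer sign and magnitude collisions across models.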