Learning across domains is challenging when data cannot be centralized because of privacy constraints or heterogeneity, which makes it difficult to train a single comprehensive model. Model merging offers an appealing alternative: it consolidates knowledge from multiple specialized models into one, avoiding data sharing and reducing retraining cost. In this work, we present DMM, a data-free model merging framework designed to handle highly divergent models. DMM proceeds in three steps. First, domain-specific models are trained independently. Second, models with high similarity are merged using standard techniques to ensure stability. Third, we synthesize pseudo-data from normalization statistics and use these samples to guide a lightweight refinement that distills knowledge from the divergent models into the merged model. This approach preserves rare but critical knowledge while maintaining stability. Extensive experiments on unimodal and multimodal benchmarks show that DMM achieves state-of-the-art performance compared with existing merging methods.
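The three-step procedure can be illustrated with a toy sketch. This is not the DMM implementation. The abstract does not specify the similarity measure, the pseudo-data sampler, or the distillation loss, so everything here is an assumption made for illustration: models are plain linear weight vectors, similarity is cosine similarity to the group mean with a hypothetical threshold of 0.5, "standard merging" is simple averaging, pseudo-inputs are drawn feature-wise from a Gaussian defined by stored means and variances, and refinement is a few gradient steps on a mean-squared-error distillation loss. All function names are hypothetical.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def split_by_similarity(weights, threshold=0.5):
    """Split model indices into 'similar' and 'divergent' (assumed criterion:
    cosine similarity to the mean weight vector)."""
    center = mean_vector(weights)
    similar = [i for i, w in enumerate(weights) if cosine(w, center) >= threshold]
    divergent = [i for i in range(len(weights)) if i not in similar]
    return similar, divergent

def sample_pseudo_inputs(means, variances, n, rng):
    """Draw n inputs feature-wise from N(mean, variance); no real data is used."""
    return [[rng.gauss(m, math.sqrt(v)) for m, v in zip(means, variances)]
            for _ in range(n)]

def predict(w, x):
    """Output of a linear model with weights w on input x."""
    return sum(a * b for a, b in zip(w, x))

def distill_loss(student, teacher, inputs):
    """Mean squared difference between student and teacher outputs."""
    return sum((predict(student, x) - predict(teacher, x)) ** 2
               for x in inputs) / len(inputs)

def refine(student, teacher, inputs, lr=0.05, steps=5):
    """Lightweight refinement: a few gradient steps on the distillation loss."""
    w = list(student)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x in inputs:
            err = predict(w, x) - predict(teacher, x)
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(inputs)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Step 1 (assumed already done): three independently trained toy models.
models = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.2]]

# Step 2: merge only the models that are similar to each other.
similar, divergent = split_by_similarity(models)
merged = mean_vector([models[i] for i in similar])

# Step 3: synthesize pseudo-data from (made-up) normalization statistics,
# then distill the divergent model into the merged one.
rng = random.Random(0)
pseudo = sample_pseudo_inputs(means=[0.0, 0.5], variances=[1.0, 0.25], n=64, rng=rng)
teacher = models[divergent[0]]
loss_before = distill_loss(merged, teacher, pseudo)
refined = refine(merged, teacher, pseudo)
loss_after = distill_loss(refined, teacher, pseudo)
```

Keeping the step count and learning rate small is what makes the refinement "lightweight" in this sketch: the merged weights move toward the divergent teacher without being replaced by it. With deep networks, the Gaussian sampler would more plausibly be replaced by optimizing inputs to match the stored normalization statistics, but that is likewise an assumption rather than something the abstract states.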