Learning across domains is challenging when data cannot be centralized due to privacy constraints or heterogeneity, which precludes training a single comprehensive model. Model merging offers an appealing alternative: it consolidates knowledge from multiple specialized models into one, avoiding data sharing and reducing retraining cost. In this work, we present DMM, a data-free model merging framework designed to handle highly divergent models. DMM proceeds in three steps. First, domain-specific models are trained independently. Second, models with high similarity are merged using standard techniques to ensure stability. Third, we synthesize pseudo-data from normalization statistics and use these samples to guide a lightweight refinement that distills knowledge from the divergent models into the merged model. This approach preserves rare but critical knowledge while maintaining stability. Extensive experiments on unimodal and multimodal benchmarks show that DMM outperforms existing merging methods, achieving state-of-the-art results.
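To make the three-step pipeline concrete, below is a minimal PyTorch sketch, assuming image classifiers with BatchNorm layers. The function names (merge_similar, synthesize_pseudo_data, distill) are hypothetical illustrations rather than the authors' API, and the synthesis step shown is a generic DeepInversion-style inversion of BatchNorm running statistics, used here as a stand-in for the paper's pseudo-data procedure.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def merge_similar(models):
    """Step 2 (sketch): merge high-similarity models by plain weight
    averaging; the paper refers only to 'standard techniques'."""
    merged = copy.deepcopy(models[0])
    with torch.no_grad():
        for name, param in merged.named_parameters():
            stacked = torch.stack([dict(m.named_parameters())[name] for m in models])
            param.copy_(stacked.mean(dim=0))
    return merged

def synthesize_pseudo_data(teacher, shape, steps=200, lr=0.1):
    """Step 3a (sketch): optimize random inputs so that per-channel batch
    statistics inside the teacher match its stored BN running statistics."""
    for p in teacher.parameters():
        p.requires_grad_(False)
    x = torch.randn(*shape, requires_grad=True)  # e.g. shape = (64, 3, 32, 32)
    opt = torch.optim.Adam([x], lr=lr)
    bn_losses = []

    def hook(module, inp, out):
        # Compare the current batch statistics of the BN input against
        # the layer's running mean/variance collected during training.
        feat = inp[0]
        mean = feat.mean(dim=(0, 2, 3))
        var = feat.var(dim=(0, 2, 3), unbiased=False)
        bn_losses.append(F.mse_loss(mean, module.running_mean)
                         + F.mse_loss(var, module.running_var))

    handles = [m.register_forward_hook(hook)
               for m in teacher.modules() if isinstance(m, nn.BatchNorm2d)]
    teacher.eval()
    for _ in range(steps):
        opt.zero_grad()
        bn_losses.clear()
        teacher(x)
        sum(bn_losses).backward()
        opt.step()
    for h in handles:
        h.remove()
    return x.detach()

def distill(student, teacher, pseudo_x, steps=100, lr=1e-4, T=2.0):
    """Step 3b (sketch): lightweight refinement that KL-distills the
    divergent teacher's soft predictions on pseudo-data into the merged
    student, so rare knowledge survives without real data."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    student.train()
    for _ in range(steps):
        with torch.no_grad():
            t_logits = teacher(pseudo_x)
        s_logits = student(pseudo_x)
        loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                        F.softmax(t_logits / T, dim=1),
                        reduction="batchmean") * T * T
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student
```

Under these assumptions, the pipeline would run as `student = merge_similar(similar_models)`, followed by `pseudo_x = synthesize_pseudo_data(divergent_model, shape)` and `distill(student, divergent_model, pseudo_x)`; the actual similarity criterion and refinement objective are specified in the paper, not here.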