Model merging combines multiple models into a single model with aggregated capabilities, making it a powerful tool for large language model (LLM) development. However, scaling model merging is challenging: performance depends on the choice of merge operator, model subset, and merge order, often requiring expensive merge-and-evaluate searches. In this work, we introduce SimMerge, a predictive merge-selection method that identifies high-performing merges using inexpensive, task-agnostic similarity signals between models. Given a small set of unlabeled probes, SimMerge extracts functional and structural features to predict the performance of candidate two-way merges, enabling selection of the merge operator, merge order, and model subset without iterative evaluation. We show that SimMerge consistently outperforms the best fixed merge operator across 7B-parameter LLMs and generalizes to multi-way merges and 111B-parameter LLMs without retraining. We further introduce a bandit variant that supports adding new tasks and operators online. Our results suggest that learning how to merge enables scalable model composition when checkpoint catalogs are large and evaluation budgets are limited.
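The selection idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature choices (cosine similarity of mean probe activations as the functional signal, cosine similarity of flattened weights as the structural signal), the function names, and the linear predictor are all assumptions made for the example.

```python
import numpy as np

def functional_similarity(acts_a: np.ndarray, acts_b: np.ndarray) -> float:
    """Cosine similarity between two models' mean activations on a shared
    set of unlabeled probes (a stand-in for a task-agnostic functional signal)."""
    a, b = acts_a.mean(axis=0), acts_b.mean(axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def structural_similarity(w_a: np.ndarray, w_b: np.ndarray) -> float:
    """Cosine similarity between flattened weight tensors
    (a stand-in for a structural signal)."""
    a, b = w_a.ravel(), w_b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_candidate_merges(candidates, predictor):
    """Score each candidate two-way merge with a learned predictor and
    return (predicted_score, name) pairs, best-predicted merge first.
    No candidate merge is actually built or evaluated here."""
    scored = []
    for cand in candidates:
        features = np.array([
            functional_similarity(cand["acts_a"], cand["acts_b"]),
            structural_similarity(cand["w_a"], cand["w_b"]),
        ])
        scored.append((float(predictor(features)), cand["name"]))
    return sorted(scored, reverse=True)
```

A usage pattern would compute probe activations once per checkpoint, then rank all pairwise candidates with `rank_candidate_merges` and only materialize the top-ranked merge, replacing the merge-and-evaluate search with cheap feature extraction plus a single predictor call per pair.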