Navigating the capability-efficiency trade-offs in Large Language Models (LLMs) requires constructing a high-quality Pareto set. However, existing merging techniques remain inadequate: coarse-grained, model-level methods yield only a sparse set of suboptimal solutions, while fine-grained, layer-wise optimization suffers from the curse of dimensionality, especially under tight evaluation budgets where each candidate model is costly to assess. We propose Bayesian Model Merging with Structural Importance Prior (SIP-BMM), an evolutionary-loop framework driven by Log Noisy Expected Hypervolume Improvement ($q$LogNEHVI) that makes layer-wise Pareto set construction tractable by explicitly modeling which layers matter. Specifically, SIP-BMM derives a \textbf{Structural Importance Prior (SIP)} from layer-wise task-vector differences between base and expert models, and uses this prior to steer Bayesian Optimization toward a low-dimensional effective subspace. Intuitively, SIP directs the optimizer to spend most trials on a small set of influential layers while largely ignoring layers that exhibit minimal task-relevant shifts. This importance-aware search preserves layer-wise control while substantially reducing sample complexity. Experiments show that SIP-BMM discovers a stronger and denser Pareto front than competitive baselines, enabling agile model selection under diverse operational constraints. Code is available at: https://github.com/MiLab-HITSZ/2026-SIPBMM.
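To make the core idea concrete, the following is a minimal sketch of how a layer-wise importance prior could be derived from task-vector magnitudes, as the abstract describes. It assumes each model is given as a dict mapping layer names to weight arrays; the function name, the per-layer norm scoring, and the softmax normalization with a `temperature` parameter are all illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def structural_importance_prior(base, expert, temperature=1.0):
    """Score each layer by the size of its task vector (expert - base),
    then softmax-normalize the scores into a prior over layers that
    could bias which layers Bayesian Optimization explores."""
    scores = {}
    for name, w_base in base.items():
        delta = expert[name] - w_base              # layer-wise task vector
        # size-normalized Frobenius norm so layers of different shapes compare fairly
        scores[name] = np.linalg.norm(delta) / np.sqrt(delta.size)
    vals = np.array(list(scores.values())) / temperature
    probs = np.exp(vals - vals.max())              # numerically stable softmax
    probs /= probs.sum()
    return dict(zip(scores.keys(), probs))

# Toy example: layer.1 shifted far more than layer.0, so it should
# receive most of the prior mass.
rng = np.random.default_rng(0)
base = {"layer.0": np.zeros((4, 4)), "layer.1": np.zeros((4, 4))}
expert = {
    "layer.0": base["layer.0"] + 0.01 * rng.standard_normal((4, 4)),
    "layer.1": base["layer.1"] + 1.0 * rng.standard_normal((4, 4)),
}
prior = structural_importance_prior(base, expert)
```

In this toy setup the prior concentrates on `layer.1`, mirroring the abstract's intuition that the optimizer should spend most trials on the few layers with large task-relevant shifts.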