Fine-tuning-based adaptation is widely used to customize diffusion-based image generation, and has produced large collections of community-created adapters that capture diverse subjects and styles. Adapters derived from the same base model can be merged as a weighted combination, enabling the synthesis of new visual results within a vast, continuous design space. To explore this space, current workflows rely on manual slider-based tuning, an approach that scales poorly and makes weight selection difficult even when the candidate set is limited to 20 to 30 adapters. We propose GimmBO to support interactive exploration of adapter merging for image generation through Preferential Bayesian Optimization (PBO). Motivated by observations from real-world usage, including sparsity and constrained weight ranges, we introduce a two-stage Bayesian optimization (BO) backend that improves sampling efficiency and convergence in high-dimensional spaces. We evaluate our approach with simulated users and a user study, demonstrating improved convergence, high success rates, and consistent gains over BO and line-search baselines, and further show the flexibility of the framework through several extensions.
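To make the merging step concrete, the following is a minimal sketch of weighted adapter merging, not the GimmBO implementation. It assumes each adapter can be represented as a mapping from parameter name to a flattened weight delta, and that merge weights are clamped to a fixed range. The function name `merge_adapters`, the `Adapter` type, the parameter names, and the default range of 0 to 1 are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flattened weight delta.
Adapter = Dict[str, List[float]]


def merge_adapters(
    adapters: List[Adapter],
    weights: List[float],
    w_min: float = 0.0,  # assumed lower bound, not taken from the paper
    w_max: float = 1.0,  # assumed upper bound, not taken from the paper
) -> Adapter:
    """Return the weighted sum of adapter deltas, clamping each weight."""
    if len(adapters) != len(weights):
        raise ValueError("need exactly one weight per adapter")
    merged: Adapter = {}
    for adapter, w in zip(adapters, weights):
        w = min(max(w, w_min), w_max)  # constrained weight range
        if w == 0.0:
            continue  # sparsity: an inactive adapter contributes nothing
        for name, delta in adapter.items():
            acc = merged.setdefault(name, [0.0] * len(delta))
            for i, d in enumerate(delta):
                acc[i] += w * d
    return merged


# Toy adapters with made-up parameter names and values.
style = {"layer.q": [1.0, 0.0], "layer.v": [0.0, 2.0]}
subject = {"layer.q": [0.0, 4.0]}
unused = {"layer.q": [9.0, 9.0]}

# The weight 1.5 is clamped to 1.0; the weight 0.0 drops the third adapter.
merged = merge_adapters([style, subject, unused], [0.5, 1.5, 0.0])
print(merged)
```

With 20 to 30 adapters, the weight vector passed to such a function has 20 to 30 dimensions, most of which are typically zero. This is the space that manual sliders cover poorly and that a preference-driven optimizer would search instead.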