Collaborative generation, which coordinates multiple diffusion trajectories, has emerged as a powerful paradigm for extending the capabilities of pretrained diffusion priors. Among existing approaches, diffusion synchronization provides a scenario-agnostic solution by introducing general guidance mechanisms. However, current synchronization approaches rely heavily on heuristics and still require task-specific tailoring, which limits their generalizability and performance. In this work, we derive a synchronization framework from optimal control, providing a principled explanation of diffusion synchronization. During sampling, we optimize control variables to guide multiple trajectories toward coherent solutions while keeping them close to the underlying diffusion prior. Our method operates entirely at test time without additional training, which enables broad applicability across diverse generation scenarios when combined with strong pretrained priors. We demonstrate consistent improvements over baselines on three representative collaborative generation tasks spanning multiple modalities and applications. Beyond performance gains, our work establishes a new foundation for collaborative generation, opening a principled path toward extending pretrained generative models to new collaborative settings.