Collaborative generation, which coordinates multiple diffusion trajectories, has emerged as a powerful paradigm for extending the capabilities and applicability of pretrained diffusion models. Among existing approaches, diffusion synchronization offers a scenario-agnostic solution by introducing general guidance mechanisms. However, current synchronization approaches rely heavily on heuristics and still require task-specific tailoring, which limits their generalizability and performance. In this work, we derive a synchronization framework from optimal control, providing a principled explanation of diffusion synchronization. During sampling, we optimize control variables to guide multiple trajectories toward coherent solutions while keeping them close to the underlying diffusion prior. Our method operates entirely at test time without additional training, so it applies across diverse generation scenarios when combined with strong pretrained priors. We demonstrate consistent improvements over baselines on three representative collaborative generation tasks covering a range of modalities and applications. Beyond performance gains, our work establishes a foundation for collaborative generation, opening a principled path toward extending pretrained generative models to new collaborative generation settings.