Recent diffusion editors perform diverse instruction-based edits while conditioning on the source image at every denoising step. Yet persistent source-image conditioning can limit how fully an edit is executed and how natural the result appears, especially when the target scene diverges substantially from the input. We introduce DuET (Dual Expert Trajectories), a training-free inference method that temporarily relaxes source-image conditioning by transitioning through a text-to-image phase before returning to edit mode, allowing the denoising trajectory to move toward the target distribution while retaining the structural benefits of image-conditioned editing. Without modifying model weights or increasing sampling cost, DuET consistently improves instruction relevance, semantic fidelity, and perceptual quality across diverse models and benchmarks. In some cases, these gains come with a modest reduction in source-image preservation, revealing a predictable trade-off between source preservation and edit fidelity.
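The switching idea can be illustrated with a minimal sketch. The abstract does not say when the text-to-image phase begins or ends, or how the two modes are invoked, so everything below is an assumption made for illustration: the names `mode_schedule` and `run_trajectory`, the window fractions `t2i_start` and `t2i_end`, and the two step callables are hypothetical and are not DuET's actual interface. The sketch only shows the general shape: each denoising step makes exactly one model call, in either edit mode (source-conditioned) or text-to-image mode (text-only), so the total number of calls is the same as in a plain editing run.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def mode_schedule(num_steps: int, t2i_start: float, t2i_end: float) -> List[str]:
    """Label each denoising step as 'edit' (source-conditioned) or 't2i' (text-only).

    t2i_start and t2i_end are hypothetical fractions of the trajectory in [0, 1];
    the source does not state where the text-to-image phase is placed.
    """
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    if not 0.0 <= t2i_start <= t2i_end <= 1.0:
        raise ValueError("expected 0 <= t2i_start <= t2i_end <= 1")
    lo = round(t2i_start * num_steps)
    hi = round(t2i_end * num_steps)
    return ["t2i" if lo <= i < hi else "edit" for i in range(num_steps)]


def run_trajectory(
    x: T,
    schedule: List[str],
    edit_step: Callable[[T, int], T],
    t2i_step: Callable[[T, int], T],
) -> T:
    """Run one denoising trajectory, dispatching each step to one of two modes.

    edit_step and t2i_step are placeholders for a single denoising update with
    and without source-image conditioning. Exactly one of them is called per
    step, so the number of model evaluations equals len(schedule).
    """
    for i, mode in enumerate(schedule):
        step = t2i_step if mode == "t2i" else edit_step
        x = step(x, i)
    return x
```

For example, `mode_schedule(10, 0.2, 0.5)` marks a contiguous block of middle steps as text-only and leaves the remaining steps source-conditioned. Under this assumed setup, widening the window would relax source conditioning for more of the trajectory, which is one way to picture the trade-off between source preservation and edit fidelity that the abstract describes.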