We propose Diffusion Noise Optimization (DNO), a method that uses existing motion diffusion models as motion priors for a wide range of motion-related tasks. Instead of training a task-specific diffusion model for each new task, DNO optimizes the diffusion latent noise of an existing pre-trained text-to-motion model. Given the latent noise corresponding to a human motion, DNO propagates the gradient of a target criterion defined in motion space back through the entire denoising process to update that latent noise. As a result, DNO supports any use case in which the criterion can be expressed as a function of the motion. In particular, we show that for motion editing and control, DNO outperforms existing methods both in achieving the objective and in preserving the motion content. DNO accommodates a diverse range of editing modes, including changing the trajectory, pose, or joint locations, and avoiding newly added obstacles. In addition, DNO is effective for motion denoising and completion, producing smooth and realistic motion from noisy and partial inputs. DNO achieves these results at inference time, without retraining the model, and therefore applies to any reward or loss function defined on the motion representation.
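The core loop described above (run the denoising chain from a latent noise, evaluate a criterion on the resulting motion, backpropagate through every denoising step, update the noise) can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration and not the authors' implementation: the "denoiser" is a hypothetical toy chain (`tanh` of a 3-tap smoothing filter) standing in for a pre-trained motion diffusion model, the "motion" is a 1D trajectory of 16 frames, the criterion is a single keyframe constraint, and the update is plain gradient descent with a hand-written backward pass.

```python
import math
import random

def blur(x):
    """Symmetric 3-tap smoothing with zero padding (symmetric, so it is its own transpose)."""
    n = len(x)
    return [
        0.5 * x[i]
        + 0.25 * (x[i - 1] if i > 0 else 0.0)
        + 0.25 * (x[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]

def denoise(z, num_steps):
    """Hypothetical toy 'denoising' chain: x <- tanh(blur(x)), repeated num_steps times."""
    x, cache = list(z), []
    for _ in range(num_steps):
        x = [math.tanh(h) for h in blur(x)]
        cache.append(x)  # tanh outputs suffice to form tanh'(h) = 1 - tanh(h)^2
    return x, cache

def backprop_to_noise(grad_x0, cache):
    """Reverse pass through every step of the chain, returning dL/dz."""
    g = list(grad_x0)
    for x in reversed(cache):
        g = [gi * (1.0 - xi * xi) for gi, xi in zip(g, x)]  # back through tanh
        g = blur(g)  # back through blur (transpose equals blur itself)
    return g

def optimize_noise(z, frame, target, num_steps=4, lr=1.0, iters=500):
    """Gradient descent on the latent noise z for a keyframe criterion on the output."""
    z, losses = list(z), []
    for _ in range(iters):
        x0, cache = denoise(z, num_steps)
        err = x0[frame] - target
        losses.append(err * err)       # criterion defined on the motion
        grad_x0 = [0.0] * len(z)
        grad_x0[frame] = 2.0 * err     # dL/dx0
        grad_z = backprop_to_noise(grad_x0, cache)
        z = [zi - lr * gi for zi, gi in zip(z, grad_z)]
    return z, losses

random.seed(0)
z_init = [random.gauss(0.0, 1.0) for _ in range(16)]  # latent noise of the motion to edit
z_opt, losses = optimize_noise(z_init, frame=8, target=0.5)
x_edit, _ = denoise(z_opt, 4)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.2e}, x_edit[8] = {x_edit[8]:.4f}")
```

Only the latent noise is updated; the parameters of the (toy) denoiser stay fixed, which mirrors the inference-time, no-retraining setting. In a real setting the backward pass would typically come from an automatic differentiation framework rather than being written by hand.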