Generating human motion that satisfies customized goal functions in a zero-shot manner is a critical capability, enabling applications such as controllable character animation and behavior synthesis for virtual agents. While current approaches handle many unseen constraints, they fail on tasks with very challenging spatiotemporal restrictions, such as severe spatial obstacles or a specified number of walking steps. To equip motion generators for these highly constrained tasks, we present a retrieval-guided method built on the training-free diffusion noise optimization framework. The key idea is to search large motion datasets for guidance that can potentially satisfy difficult constraints. We introduce relational task parsing to group target constraints and identify the difficult ones to be handled by retrieved references. A better initialization for the diffusion noise is then obtained via a reward-guided mask that combines random noise with retrieved noise. By optimizing the diffusion noise from this improved initialization, we successfully solve highly constrained generation tasks. By leveraging an LLM for relational task parsing, the framework can further reason automatically about what to retrieve, improving the intelligence of moving agents under a training-free optimization scheme.
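The reward-guided initialization described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the mask is computed, at what granularity it operates, or how noise is obtained from a retrieved motion, so everything below is an assumption made for illustration: the mask is taken to be a per-frame binary threshold on a reward score, and the names `blend_noise`, `gaussian_noise`, `frame_rewards`, and `threshold` are hypothetical. The subsequent diffusion noise optimization step is omitted.

```python
import random


def blend_noise(random_noise, retrieved_noise, frame_rewards, threshold=0.5):
    """Combine two noise sequences frame by frame (hypothetical scheme).

    Frames whose reward meets the threshold take the retrieved noise;
    all other frames keep the random noise.
    """
    if not (len(random_noise) == len(retrieved_noise) == len(frame_rewards)):
        raise ValueError("all inputs must have one entry per frame")
    # Assumed mask rule: 1 where the retrieved reference scores well.
    mask = [1 if r >= threshold else 0 for r in frame_rewards]
    blended = [
        list(ret) if m else list(rnd)
        for m, rnd, ret in zip(mask, random_noise, retrieved_noise)
    ]
    return blended, mask


def gaussian_noise(num_frames, dim, rng):
    """Draw a (num_frames x dim) list of standard normal samples."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]


rng = random.Random(0)
num_frames, dim = 6, 4
z_rand = gaussian_noise(num_frames, dim, rng)
# Stand-in for noise recovered from a retrieved reference motion.
z_ret = gaussian_noise(num_frames, dim, rng)
# Made-up per-frame scores of how well the reference meets the hard constraint.
rewards = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]

z_init, mask = blend_noise(z_rand, z_ret, rewards)
print(mask)  # [1, 1, 0, 0, 1, 0]
```

With a binary mask, each entry of the blended noise is copied unchanged from one of the two sources, so the result keeps unit variance when both inputs are standard Gaussian. A soft mask would need rescaling to preserve that property.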