Operating effectively in complex environments while complying with specified constraints is crucial for the safe and successful deployment of robots that interact with and operate around people. In this work, we focus on generating long-horizon trajectories that adhere to novel static and temporally-extended constraints and instructions at test time. We propose a data-driven diffusion-based framework, LTLDoG, that modifies the inference steps of the reverse process given an instruction specified using finite linear temporal logic ($\text{LTL}_f$). LTLDoG leverages a satisfaction value function on $\text{LTL}_f$ and guides the sampling steps using its gradient field. This value function can also be trained to generalize to new instructions not observed during training, enabling flexible test-time adaptability. Experiments in robot navigation and manipulation illustrate that the method can generate trajectories satisfying formulae that specify obstacle avoidance and visitation sequences.
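To make the idea of guiding sampling with the gradient of a satisfaction value concrete, the following is a minimal sketch and not the LTLDoG implementation. It assumes a toy 2D setting with circular regions, replaces the learned value function with a hand-written quantitative score for the formula "eventually reach the goal and always avoid the obstacle", and replaces the diffusion reverse process with plain gradient ascent on that score. All names (`inside`, `satisfaction`, `numerical_gradient`, `guide`) and all region parameters are hypothetical.

```python
import math


def inside(point, center, radius):
    """Signed margin: positive when the point lies inside the circle."""
    return radius - math.hypot(point[0] - center[0], point[1] - center[1])


def satisfaction(traj, goal, obstacle):
    """Quantitative score for (F goal) AND (G NOT obstacle) on a finite trace.

    'Eventually' becomes a max over time steps, 'always' a min, negation a
    sign flip, and conjunction a min. A positive score means satisfied.
    This hand-written score is an assumption standing in for a learned
    satisfaction value function.
    """
    eventually_goal = max(inside(p, *goal) for p in traj)
    always_avoid = min(-inside(p, *obstacle) for p in traj)
    return min(eventually_goal, always_avoid)


def numerical_gradient(traj, goal, obstacle, eps=1e-4):
    """Central finite-difference gradient of the score w.r.t. each waypoint."""
    grad = []
    for i in range(len(traj)):
        row = []
        for axis in range(2):
            plus = [list(p) for p in traj]
            minus = [list(p) for p in traj]
            plus[i][axis] += eps
            minus[i][axis] -= eps
            diff = (satisfaction(plus, goal, obstacle)
                    - satisfaction(minus, goal, obstacle))
            row.append(diff / (2 * eps))
        grad.append(row)
    return grad


def guide(traj, goal, obstacle, steps=10, step_size=0.1):
    """Nudge waypoints along the score gradient.

    Stand-in for guided sampling: there is no noise and no denoiser here,
    only the guidance term.
    """
    traj = [list(p) for p in traj]
    for _ in range(steps):
        grad = numerical_gradient(traj, goal, obstacle)
        for p, g in zip(traj, grad):
            p[0] += step_size * g[0]
            p[1] += step_size * g[1]
    return traj


# Hypothetical regions, each given as (center, radius).
goal = ((4.0, 0.0), 0.5)
obstacle = ((2.0, 0.5), 1.0)

# A straight line from (0, 0) to (4, 0) that cuts through the obstacle.
straight = [(float(x), 0.0) for x in range(5)]

before = satisfaction(straight, goal, obstacle)
after = satisfaction(guide(straight, goal, obstacle), goal, obstacle)
print(before, after)
```

In this toy case the straight-line trajectory starts with a score of -0.5, because its middle waypoint sits 0.5 units inside the obstacle, and the guidance steps should push the violating waypoints outward until the score becomes positive. A real guided sampler would add this gradient term to each denoising step rather than applying it alone, and the min and max operators would typically be smoothed so that more than one waypoint receives gradient at a time.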