Diffusion models have emerged as a powerful tool in robotics due to their flexibility and multi-modality. While some of these methods effectively address complex problems, they often depend heavily on inference-time obstacle detection and require additional equipment. To address these challenges, we present a method that, at inference time, simultaneously generates only reachable goals and plans obstacle-avoiding motions, all from a single visual input. Central to our approach is the novel use of a collision-avoiding diffusion kernel for training. In evaluations against behavior-cloning and classical diffusion models, our framework proves robust. It is particularly effective in multi-modal environments, where it navigates toward reachable goals and avoids goals blocked by obstacles, while ensuring collision avoidance. Project Website: https://sites.google.com/view/denoising-heat-inspired