We introduce a novel approach to the executable semantic object rearrangement problem. In this problem, a robot must produce an executable plan that rearranges the objects in a scene according to a pattern specified by a natural language description. Existing methods such as StructFormer and StructDiffusion tackle the problem in two steps: they first generate goal poses and then rely on a task planner to formulate an action plan. In contrast, our method addresses pose generation and action planning jointly. We achieve this integration with Language-Guided Monte-Carlo Tree Search (LGMCTS). We provide quantitative evaluations on two simulation datasets, complemented by qualitative tests with a real robot.