We introduce a novel approach to the executable semantic object rearrangement problem. In this problem, a robot must produce an executable plan that rearranges the objects in a scene according to a pattern specified by a natural language description. Existing methods such as StructFormer and StructDiffusion tackle the problem in two steps, first generating poses and then relying on a task planner to formulate an action plan. Our method instead addresses pose generation and action planning simultaneously, using Language-Guided Monte-Carlo Tree Search (LGMCTS). We provide quantitative evaluations on two simulation datasets, complemented by qualitative tests on a real robot.