We introduce GRS (Generating Robotic Simulation tasks), a system that addresses the real-to-sim problem for robotic simulation. From a single RGB-D observation, GRS creates a digital twin simulation together with solvable tasks for training virtual agents. Built on vision-language models (VLMs), the pipeline operates in three stages: 1) scene comprehension, which uses SAM2 for segmentation and produces object descriptions, 2) matching the detected objects with simulation-ready assets, and 3) generating appropriate tasks. To keep each simulation aligned with its task, we generate test suites and introduce a router that iteratively refines both the simulation code and the test code. Experiments demonstrate the system's effectiveness at object correspondence and, through the router mechanism, at generating task environments.
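The router described in the abstract can be pictured as a loop that runs the generated test suite against the generated simulation and, on failure, decides which of the two to revise. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `SuiteResult`, `refine_with_router`, `run_tests`, `route`, `fix_sim`, and `fix_test` are hypothetical, and the VLM-backed components are replaced by toy string-manipulating stubs so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SuiteResult:
    """Outcome of running the test suite against the simulation (hypothetical type)."""
    passed: bool
    error: str = ""


def refine_with_router(
    sim_code: str,
    test_code: str,
    run_tests: Callable[[str, str], SuiteResult],
    route: Callable[[str, str, str], str],
    fix_sim: Callable[[str, str], str],
    fix_test: Callable[[str, str], str],
    max_iters: int = 5,
) -> Tuple[str, str, bool]:
    """Alternate between running tests and repairing whichever artifact the router blames."""
    for _ in range(max_iters):
        result = run_tests(sim_code, test_code)
        if result.passed:
            return sim_code, test_code, True
        # The router picks the artifact to revise; here it returns a plain label.
        target = route(sim_code, test_code, result.error)
        if target == "simulation":
            sim_code = fix_sim(sim_code, result.error)
        elif target == "test":
            test_code = fix_test(test_code, result.error)
        else:
            raise ValueError(f"unknown routing target: {target!r}")
    # Check whether the last repair made the suite pass.
    return sim_code, test_code, run_tests(sim_code, test_code).passed


def demo() -> Tuple[str, str, bool]:
    """Toy stand-ins for the VLM-backed components (assumptions, for illustration only)."""

    def run_tests(sim: str, test: str) -> SuiteResult:
        if "spawn_cube()" not in sim:
            return SuiteResult(False, "simulation: cube missing from scene")
        if "assert" not in test:
            return SuiteResult(False, "test: suite contains no assertion")
        return SuiteResult(True)

    def route(sim: str, test: str, error: str) -> str:
        return "simulation" if error.startswith("simulation") else "test"

    def fix_sim(sim: str, error: str) -> str:
        return sim + "\nspawn_cube()"

    def fix_test(test: str, error: str) -> str:
        return test + "\nassert cube_on_table()"

    return refine_with_router(
        "load_scene()", "check_scene()", run_tests, route, fix_sim, fix_test
    )


if __name__ == "__main__":
    sim, test, ok = demo()
    print("suite passed:", ok)
```

In this sketch the router is a single function returning a label, which keeps the control flow easy to follow; in a real system that decision would presumably come from a model inspecting the failing test output, and the iteration cap guards against a loop that never converges.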