Robots today can exploit the rich world knowledge of large language models to chain simple behavioral skills into long-horizon tasks. However, robots are often interrupted during long-horizon tasks by primitive skill failures and dynamic environments. We propose VADER, a plan-execute-detect framework that adds seeking help as a new skill, enabling robots to recover and complete long-horizon tasks with the help of humans or other robots. VADER leverages visual question answering (VQA) modules to detect visual affordances and recognize execution errors. It then generates prompts for a language model planner (LMP), which decides when to seek help from another robot or a human to recover from errors in long-horizon task execution. We show the effectiveness of VADER on two long-horizon robotic tasks. Our pilot study showed that VADER can perform complex long-horizon tasks by asking another robot for help to clear a table. Our user study showed that VADER can perform complex long-horizon tasks by asking a human for help to clear a path. We gathered feedback from participants (N=19) about the performance of VADER versus a robot that did not ask for help. https://google-vader.github.io/
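The plan-execute-detect loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Step`, `run_task`, `ToyWorld`, `toy_planner`) is hypothetical, the `detect` callable is only a stand-in for the VQA modules, and the rule-based `toy_planner` is only a stand-in for the language model planner. The toy scenario assumes a blocked path that a helper clears when asked.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    skill: str   # hypothetical skill name, e.g. "navigate" or "seek_help"
    target: str  # what the skill acts on, or the error to report to a helper


# Each history entry pairs an executed step with the detected error (or None).
History = List[Tuple[Step, Optional[str]]]


def run_task(
    goal: str,
    plan: Callable[[str, History], Optional[Step]],
    execute: Callable[[Step], None],
    detect: Callable[[Step], Optional[str]],
    max_steps: int = 10,
) -> History:
    """Plan, execute, detect; repeat until the planner returns None."""
    history: History = []
    for _ in range(max_steps):
        step = plan(goal, history)            # stand-in for the LMP
        if step is None:                      # planner considers the task done
            break
        execute(step)                         # run one primitive skill
        history.append((step, detect(step)))  # stand-in for VQA error detection
    return history


class ToyWorld:
    """Toy environment (an assumption): a path that starts out blocked."""

    def __init__(self) -> None:
        self.path_blocked = True
        self.robot_at_goal = False

    def execute(self, step: Step) -> None:
        if step.skill == "seek_help":
            self.path_blocked = False         # the helper clears the path
        elif step.skill == "navigate" and not self.path_blocked:
            self.robot_at_goal = True

    def detect(self, step: Step) -> Optional[str]:
        if step.skill == "navigate" and not self.robot_at_goal:
            return "path is blocked"
        return None


def toy_planner(goal: str, history: History) -> Optional[Step]:
    """Rule-based stand-in for a planner that decides when to seek help."""
    if not history:
        return Step("navigate", goal)
    last_step, error = history[-1]
    if error is not None:
        return Step("seek_help", error)       # recover by asking for help
    if last_step.skill == "seek_help":
        return Step("navigate", goal)         # retry after help arrived
    return None                               # last navigation succeeded


world = ToyWorld()
trace = run_task("kitchen", toy_planner, world.execute, world.detect)
skills = [step.skill for step, _ in trace]
print(skills)
```

In this toy run the first navigation attempt fails, the planner responds to the detected error by issuing a help request, and the retry then succeeds. A real system would replace the rule-based planner with prompts to a language model and the `detect` stub with queries to a VQA model.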