In this paper, we focus on inferring whether a given user command is clear, ambiguous, or infeasible in the context of interactive robotic agents that utilize large language models (LLMs). To tackle this problem, we first present an uncertainty estimation method for LLMs that classifies whether a command is certain (i.e., clear) or uncertain (i.e., ambiguous or infeasible). Once a command is classified as uncertain, we further distinguish between ambiguous and infeasible commands by leveraging LLMs with situationally aware context in a zero-shot manner. For ambiguous commands, we disambiguate the command by interacting with users via question generation with LLMs. We believe that proper recognition of the given commands can reduce malfunctions and undesired actions of the robot, enhancing the reliability of interactive robotic agents. We present a dataset for robotic situational awareness, consisting of pairs of high-level commands, scene descriptions, and command-type labels (i.e., clear, ambiguous, or infeasible). We validate the proposed method on the collected dataset in a pick-and-place tabletop simulation. Finally, we demonstrate the proposed approach in real-world human-robot interaction experiments, i.e., handover scenarios.
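The two-stage pipeline described above (uncertainty estimation first, then zero-shot disambiguation between ambiguous and infeasible commands) could be sketched roughly as follows. This is a hypothetical illustration, not the paper's implementation: `stub_scores` is a hand-written stand-in for querying an actual LLM about a command given a scene description, and the entropy threshold `tau` is an assumed parameter.

```python
import math

def entropy(probs):
    """Shannon entropy of a label distribution (dict of label -> probability)."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)

def classify_command(command, scene, score_labels, tau=0.5):
    """Stage 1: treat the command as certain (clear) when the LLM's label
    distribution is low-entropy and peaked on 'clear'.
    Stage 2: otherwise, pick between the two uncertain types zero-shot."""
    probs = score_labels(command, scene)  # {"clear": ..., "ambiguous": ..., "infeasible": ...}
    if entropy(probs) < tau and max(probs, key=probs.get) == "clear":
        return "clear"
    return max(("ambiguous", "infeasible"), key=lambda k: probs[k])

def stub_scores(command, scene):
    """Toy heuristic stand-in for an LLM scorer (hypothetical, for illustration):
    a uniquely referenced object reads as clear, a generic reference as ambiguous,
    and an absent object as infeasible."""
    objects = scene.split(", ")
    mentioned = [o for o in objects if o in command]
    if len(mentioned) == 1:
        return {"clear": 0.9, "ambiguous": 0.05, "infeasible": 0.05}
    if len(mentioned) > 1 or "block" in command:
        return {"clear": 0.2, "ambiguous": 0.7, "infeasible": 0.1}
    return {"clear": 0.1, "ambiguous": 0.2, "infeasible": 0.7}
```

For example, given the scene "red block, blue block, green cup", the command "pick up the green cup" would be classified as clear, "pick up the block" as ambiguous (and thus a candidate for a clarifying question), and "pick up the banana" as infeasible.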