In this paper, we focus on inferring whether a given user command is clear, ambiguous, or infeasible in the context of interactive robotic agents that use large language models (LLMs). To tackle this problem, we first present an uncertainty estimation method for LLMs that classifies a command as certain (i.e., clear) or uncertain (i.e., ambiguous or infeasible). Once a command is classified as uncertain, we further distinguish between ambiguous and infeasible commands by leveraging LLMs with situationally aware context in a zero-shot manner. For ambiguous commands, we disambiguate the command by interacting with the user through LLM-generated questions. We believe that properly recognizing the type of a given command can reduce malfunctions and undesired actions of the robot, enhancing the reliability of interactive robotic agents. We present a dataset for robotic situational awareness consisting of pairs of high-level commands and scene descriptions, each labeled with a command type (i.e., clear, ambiguous, or infeasible). We validate the proposed method on the collected dataset and in a pick-and-place tabletop simulation. Finally, we demonstrate the proposed approach in real-world human-robot interaction experiments, i.e., handover scenarios.
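As an illustration only, the three-stage flow described in the abstract (uncertainty estimation, zero-shot separation of ambiguous from infeasible commands, then question generation) can be sketched as below. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `triage_command`, `uncertainty_fn`, `classify_uncertain_fn`, and `question_fn`, the scalar uncertainty score, and the 0.5 threshold are all hypothetical placeholders, and the LLM calls are left as pluggable functions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class CommandType(Enum):
    CLEAR = "clear"
    AMBIGUOUS = "ambiguous"
    INFEASIBLE = "infeasible"


@dataclass
class TriageResult:
    command_type: CommandType
    question: Optional[str] = None  # set only for ambiguous commands


def triage_command(
    command: str,
    scene: str,
    uncertainty_fn: Callable[[str, str], float],
    classify_uncertain_fn: Callable[[str, str], CommandType],
    question_fn: Callable[[str, str], str],
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> TriageResult:
    """Route a command through the three stages described in the abstract."""
    # Stage 1: a low uncertainty score means the command is treated as clear.
    if uncertainty_fn(command, scene) < threshold:
        return TriageResult(CommandType.CLEAR)

    # Stage 2: an uncertain command is either infeasible or ambiguous.
    kind = classify_uncertain_fn(command, scene)
    if kind is CommandType.INFEASIBLE:
        return TriageResult(CommandType.INFEASIBLE)

    # Stage 3: any other label is handled as ambiguous, so ask the user.
    return TriageResult(CommandType.AMBIGUOUS, question_fn(command, scene))


if __name__ == "__main__":
    # Toy stand-ins for the LLM-backed components (assumptions for the demo).
    result = triage_command(
        command="Give me something to drink",
        scene="objects: cola can, water bottle, sponge",
        uncertainty_fn=lambda c, s: 0.8,
        classify_uncertain_fn=lambda c, s: CommandType.AMBIGUOUS,
        question_fn=lambda c, s: "Would you like the cola or the water?",
    )
    print(result.command_type.value, "|", result.question)
```

Passing the three components as functions keeps the sketch independent of any particular LLM API; in practice each would wrap a model call that receives the command together with the scene description.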