In this paper, we focus on inferring whether a given user command is clear, ambiguous, or infeasible in the context of interactive robotic agents that use large language models (LLMs). To tackle this problem, we first present an uncertainty estimation method for LLMs that classifies a command as certain (i.e., clear) or uncertain (i.e., ambiguous or infeasible). Once a command is classified as uncertain, we further determine whether it is ambiguous or infeasible by prompting LLMs with situationally aware context in a zero-shot manner. For ambiguous commands, we resolve the ambiguity by interacting with the user through LLM-generated questions. We believe that correctly recognizing the type of a given command can reduce malfunctions and undesired actions of the robot, enhancing the reliability of interactive robotic agents. We present a dataset for robotic situational awareness, consisting of pairs of high-level commands and scene descriptions, each labeled with a command type (i.e., clear, ambiguous, or infeasible). We validate the proposed method on the collected dataset and in a pick-and-place tabletop simulation. Finally, we demonstrate the proposed approach in real-world human-robot interaction experiments, i.e., handover scenarios.
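The two-stage triage described above can be illustrated with a minimal sketch. The abstract does not specify how uncertainty is estimated, so this example assumes one simple proxy: the entropy of the disagreement among several plans sampled from an LLM for the same command. The function names (`disagreement_entropy`, `triage_command`), the threshold value, and the `classify_uncertain` callback (standing in for the zero-shot LLM prompt) are all hypothetical and are not taken from the paper. Question generation for ambiguous commands is left out.

```python
import math
from collections import Counter
from typing import Callable, Sequence


def disagreement_entropy(samples: Sequence[str]) -> float:
    """Shannon entropy (in nats) of the empirical distribution over sampled plans.

    ASSUMPTION: disagreement among sampled LLM outputs is used here as a
    stand-in for the paper's uncertainty estimate, which the abstract
    does not describe.
    """
    if not samples:
        raise ValueError("need at least one sampled plan")
    counts = Counter(s.strip().lower() for s in samples)
    total = len(samples)
    return sum((c / total) * math.log(total / c) for c in counts.values())


def triage_command(
    command: str,
    scene: str,
    samples: Sequence[str],
    classify_uncertain: Callable[[str, str], str],
    threshold: float = 0.5,  # hypothetical value, would need tuning
) -> str:
    """Return 'clear', 'ambiguous', or 'infeasible' for a command.

    Stage 1: low disagreement among sampled plans means the command is certain.
    Stage 2: otherwise, defer to a second classifier (in practice a zero-shot
    LLM prompt that sees the scene description) to split the uncertain case.
    """
    if disagreement_entropy(samples) <= threshold:
        return "clear"
    label = classify_uncertain(command, scene)
    if label not in ("ambiguous", "infeasible"):
        raise ValueError(f"unexpected label from second stage: {label!r}")
    return label


if __name__ == "__main__":
    scene = "A table with a red block and a blue block."
    # Stub for the zero-shot LLM call; always answers 'ambiguous' here.
    stub = lambda command, scene: "ambiguous"
    plans = ["pick red block", "pick blue block", "pick red block", "pick blue block"]
    print(triage_command("pick up the block", scene, plans, stub))
```

In this sketch the second stage is only invoked when the sampled plans disagree, which mirrors the order of steps in the abstract: uncertainty is estimated first, and the ambiguous/infeasible distinction is made only for uncertain commands.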