Ambiguities are common in human-robot interaction, especially when a robot follows user instructions in a large co-located space. For instance, when the user asks the robot to find an object in a home environment, the object might be in several places depending on its semantic properties (e.g., a bowl can be in the kitchen cabinet or on the dining room table, depending on whether it is clean or dirty, full or empty, and which other objects are around it). Previous work on object semantics has predicted such relationships using one-shot inferences, which are likely to fail for ambiguous or partially understood instructions. This paper addresses this gap and proposes a semantically driven disambiguation approach that uses follow-up clarifications to handle such uncertainties. To achieve this, we first obtain semantic knowledge embeddings and then use them to generate clarifying questions in an iterative process. Our evaluation shows that the approach is model-agnostic, i.e., applicable to different semantic embedding models, and that follow-up clarifications improve performance regardless of the embedding model. Additionally, our ablation studies show the importance of informative clarifications and iterative predictions for improving system accuracy.
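The iterative clarification loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `score` function is a hand-written toy stand-in for an embedding-based plausibility score, the attribute and location names are hypothetical, and choosing the question by lowest expected entropy is an assumption made here for illustration.

```python
import math
from itertools import product

# Hypothetical toy setup for a single object ("bowl").
ATTRIBUTES = {"clean": [True, False], "full": [True, False]}
LOCATIONS = ["cabinet", "dining_table", "sink"]


def score(attrs, location):
    """Toy stand-in for an embedding-based plausibility score (assumption)."""
    if location == "cabinet":
        return 3.0 if attrs["clean"] and not attrs["full"] else 0.1
    if location == "dining_table":
        return 3.0 if attrs["full"] else 0.5
    # location == "sink"
    return 3.0 if not attrs["clean"] and not attrs["full"] else 0.1


def completions(known):
    """Yield every full attribute assignment consistent with the known values."""
    unknown = [a for a in ATTRIBUTES if a not in known]
    for values in product(*(ATTRIBUTES[a] for a in unknown)):
        yield {**known, **dict(zip(unknown, values))}


def location_distribution(known):
    """Average the normalized location scores over all unknown attribute values."""
    totals = {loc: 0.0 for loc in LOCATIONS}
    count = 0
    for attrs in completions(known):
        raw = {loc: score(attrs, loc) for loc in LOCATIONS}
        z = sum(raw.values())
        for loc in LOCATIONS:
            totals[loc] += raw[loc] / z
        count += 1
    return {loc: total / count for loc, total in totals.items()}


def entropy(dist):
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def best_question(known):
    """Pick the unknown attribute whose answer is expected to reduce entropy most."""
    unknown = [a for a in ATTRIBUTES if a not in known]

    def expected_entropy(attr):
        values = ATTRIBUTES[attr]
        return sum(
            entropy(location_distribution({**known, attr: v})) for v in values
        ) / len(values)

    return min(unknown, key=expected_entropy)


def disambiguate(answer_fn, threshold=0.7, max_questions=3):
    """Ask follow-up questions until the top location is confident enough."""
    known, asked = {}, []
    while True:
        dist = location_distribution(known)
        top = max(dist, key=dist.get)
        nothing_left = all(a in known for a in ATTRIBUTES)
        if dist[top] >= threshold or nothing_left or len(asked) >= max_questions:
            return top, asked
        attr = best_question(known)
        known[attr] = answer_fn(attr)  # the user's clarification
        asked.append(attr)


# Simulated user whose bowl is dirty and empty.
truth = {"clean": False, "full": False}
one_shot = max(location_distribution({}), key=location_distribution({}).get)
location, questions = disambiguate(lambda attr: truth[attr])
```

In this toy setup, the one-shot guess made without any clarification is `dining_table`, whereas the loop asks about `full` and then `clean` before settling on `sink`. The point of the sketch is only the control flow: predict, check confidence, ask the most informative question, and predict again.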