When operating in human environments, robots need to handle complex tasks while both adhering to social norms and accommodating individual preferences. For instance, based on common-sense knowledge, a household robot can predict that it should avoid vacuuming during a social gathering, but it may still be uncertain whether it should vacuum before or after the guests arrive. In such cases, integrating common-sense knowledge with human preferences, often conveyed through human explanations, is fundamental yet challenging for existing systems. In this paper, we introduce GRACE, a novel approach that addresses this challenge while generating socially appropriate robot actions. GRACE leverages common-sense knowledge from Large Language Models (LLMs) and integrates this knowledge with human explanations through a generative network architecture. The bidirectional structure of GRACE enables robots to refine and enhance LLM predictions by utilizing human explanations, and it also makes robots capable of generating such explanations for human-specified actions. Our experimental evaluations show that integrating human explanations boosts GRACE's performance, allowing it to outperform several baselines while providing sensible explanations.