Some actions must be executed in different ways depending on the context. For example, wiping away marker requires vigorous force, while wiping away almonds requires gentler force. In this paper we present a model in which an agent learns which manner of action execution to use in which context, drawing on evidence from trial and error and from verbal corrections when it makes a mistake (e.g., ``no, gently''). The learner starts out with a domain model that lacks the concepts denoted by the words in the teacher's feedback: both the words describing the context (e.g., marker) and adverbs like ``gently''. We show that, through the semantics of coherence, our agent can perform the symbol grounding that is necessary for exploiting the teacher's feedback so as to solve its domain-level planning problem: to perform its actions in the right way in the current context.