What is a useful skill hierarchy for an autonomous agent? We propose an answer based on a graphical representation of how the interaction between an agent and its environment may unfold. Our approach uses modularity maximisation as a central organising principle to expose the structure of the interaction graph at multiple levels of abstraction. The result is a collection of skills that operate at varying time scales, organised into a hierarchy, where skills that operate over longer time scales are composed of skills that operate over shorter time scales. The entire skill hierarchy is generated automatically, with no human intervention, including the skills themselves (their behaviour, when they can be called, and when they terminate) as well as the hierarchical dependency structure between them. In a wide range of environments, this approach generates skill hierarchies that are intuitively appealing and that considerably improve the learning performance of the agent.
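To make the organising principle concrete, the following minimal sketch computes the modularity score that modularity maximisation seeks to increase. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which optimisation algorithm is used, and the toy graph, the `modularity` function, and the partition names are all hypothetical. The graph stands in for a tiny interaction graph with two densely connected groups of states joined by a single "doorway" transition.

```python
from collections import defaultdict


def modularity(edges, partition):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once
           (assumed non-empty).
    partition: dict mapping every node to a community label.
    """
    edges = list(edges)
    m = len(edges)
    intra = defaultdict(int)       # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # sum of node degrees in community c
    for u, v in edges:
        cu, cv = partition[u], partition[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (fraction of intra-community edges
    #     minus the fraction expected if edges were placed at random).
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Hypothetical toy interaction graph: two triangles of states ("rooms")
# connected by one transition between states 2 and 3 (the "doorway").
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

TWO_ROOMS = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
ONE_CLUSTER = {node: "A" for node in range(6)}
SINGLETONS = {node: node for node in range(6)}

if __name__ == "__main__":
    for name, part in [("two rooms", TWO_ROOMS),
                       ("one cluster", ONE_CLUSTER),
                       ("singletons", SINGLETONS)]:
        print(f"{name}: Q = {modularity(EDGES, part):.3f}")
```

On this toy graph the two-room partition scores highest (Q = 5/14, about 0.357), the single cluster scores 0, and the all-singletons partition scores below 0. A skill for moving from one cluster to the other would then correspond to crossing the doorway. Applying the same idea recursively to the graph of clusters is one way to obtain several levels of abstraction.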