We focus on reinforcement learning (RL) in relational problems that are naturally defined in terms of objects, their relations, and object-centric actions. These problems are characterized by variable state and action spaces, and finding a fixed-length representation, required by most existing RL methods, is difficult, if not impossible. We present a deep RL framework based on graph neural networks and auto-regressive policy decomposition that naturally works with these problems and is completely domain-independent. We demonstrate the framework's broad applicability in three distinct domains and show impressive zero-shot generalization over different problem sizes.
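To make the idea of an auto-regressive policy over a variable number of objects concrete, the following is a minimal sketch under stated assumptions, not the framework's actual implementation. It assumes an action is a pair (action type, target object), uses one round of mean-aggregation message passing as a stand-in for a graph neural network, and factorizes the policy as p(type | graph) · p(object | type, graph). All function names, weights, and toy graphs are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def message_pass(features, edges):
    """One round of mean aggregation over incoming edges (stand-in for a GNN layer)."""
    n, dim = len(features), len(features[0])
    incoming = [[] for _ in range(n)]
    for src, dst in edges:
        incoming[dst].append(src)
    updated = []
    for i in range(n):
        agg = [0.0] * dim
        for j in incoming[i]:
            for k in range(dim):
                agg[k] += features[j][k] / len(incoming[i])
        # Residual update: own features plus the mean of the neighbours' features.
        updated.append([f + a for f, a in zip(features[i], agg)])
    return updated


def action_distribution(features, edges, type_weights, object_weights):
    """Return p(type) and, for each type, p(object | type) over all nodes."""
    nodes = message_pass(features, edges)
    dim = len(nodes[0])
    # Mean pooling gives a fixed-length summary regardless of the number of objects.
    pooled = [sum(node[k] for node in nodes) / len(nodes) for k in range(dim)]
    type_probs = softmax([dot(w, pooled) for w in type_weights])
    # One score per node, so the object distribution grows with the graph.
    object_probs = [softmax([dot(w, node) for node in nodes]) for w in object_weights]
    return type_probs, object_probs


def sample_action(features, edges, type_weights, object_weights, rng):
    """Sample the type first, then the object conditioned on it; return the joint probability."""
    type_probs, object_probs = action_distribution(
        features, edges, type_weights, object_weights
    )
    a_type = rng.choices(range(len(type_probs)), weights=type_probs)[0]
    obj = rng.choices(range(len(features)), weights=object_probs[a_type])[0]
    return (a_type, obj), type_probs[a_type] * object_probs[a_type][obj]


# Hypothetical parameters: two action types, two-dimensional node features.
type_weights = [[0.5, -0.2], [0.1, 0.3]]
object_weights = [[1.0, 0.0], [0.0, 1.0]]

# The same parameters are applied to a 3-object and a 5-object problem.
small = ([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [(0, 1), (1, 2)])
large = (
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8], [0.9, 0.1]],
    [(0, 1), (1, 2), (2, 3), (3, 4)],
)

rng = random.Random(0)
for features, edges in (small, large):
    action, prob = sample_action(features, edges, type_weights, object_weights, rng)
    print(len(features), action, round(prob, 3))
```

Because every parameter acts either on a pooled summary or on one node at a time, nothing in the sketch depends on the number of objects, which is the property that lets a single policy be evaluated on problem sizes it was not trained on.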