Addressing Imperfect Symmetry: a Novel Symmetry-Learning Actor-Critic Extension

Symmetry, a fundamental concept for understanding our environment, often oversimplifies reality from a mathematical perspective. Humans are a prime example, deviating from perfect symmetry in both appearance and cognitive biases (e.g. having a dominant hand). Nevertheless, our brains can easily overcome these imperfections and efficiently adapt to symmetrical tasks. The driving motivation behind this work lies in capturing this ability through reinforcement learning. To this end, we introduce Adaptive Symmetry Learning (ASL), a model-minimization actor-critic extension that addresses incomplete or inexact symmetry descriptions by adapting itself during the learning process. ASL consists of a symmetry fitting component and a modular loss function that enforces a common symmetric relation across all states while adapting to the learned policy. The performance of ASL is compared to existing symmetry-enhanced methods in a case study involving a four-legged ant model for multidirectional locomotion tasks. The results demonstrate that ASL is capable of recovering from large perturbations and generalizing knowledge to hidden symmetric states. It achieves comparable or better performance than alternative methods in most scenarios, making it a valuable approach for leveraging model symmetry while compensating for inherent perturbations.

