Stability is a basic requirement when studying the behavior of dynamical systems. However, stabilizing dynamical systems via reinforcement learning is challenging because only limited data can be collected over the short time horizons before instabilities are triggered and the data become meaningless. This work introduces a reinforcement-learning approach that is formulated over latent manifolds of unstable dynamics so that stabilizing policies can be trained from few data samples. The unstable manifolds are minimal in the sense that they contain the lowest-dimensional dynamics that are necessary for learning policies that guarantee stabilization. This is in stark contrast to generic latent manifolds that aim to approximate all (stable and unstable) system dynamics and thus are higher dimensional and typically require larger amounts of data. Experiments demonstrate that the proposed approach stabilizes even complex physical systems from few data samples, in regimes where other methods that operate either directly in the system state space or on generic latent manifolds fail.
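To make the core idea concrete, the following is a minimal Python sketch of restricting learning to the unstable modes of a system, assuming access to a discrete-time linearization A of the dynamics around the target fixed point. The function name unstable_projector and the toy matrix are illustrative assumptions, not the paper's actual implementation; in practice the unstable latent manifold would be identified from data rather than from a known A.

```python
import numpy as np

def unstable_projector(A, tol=1.0):
    """Return an encoder onto the unstable eigenspace of A.

    For discrete-time dynamics, eigenvalues with |lambda| > tol lie
    outside the unit circle and correspond to unstable modes; the span
    of their eigenvectors is the low-dimensional latent space on which
    a stabilizing policy can be learned.
    """
    eigvals, eigvecs = np.linalg.eig(A)
    unstable = np.abs(eigvals) > tol
    V = np.real(eigvecs[:, unstable])   # basis of the unstable modes
    return np.linalg.pinv(V), V         # encoder (pinv) and basis

# Usage (toy example): encode the full state x into the unstable latent
# coordinate z; a policy pi(z) is then trained on z alone, which is
# lower dimensional than x whenever only a few modes are unstable.
A = np.array([[1.1, 0.2],
              [0.0, 0.9]])              # one unstable, one stable mode
encode, V = unstable_projector(A)
x = np.array([0.5, -0.3])               # full system state
z = encode @ x                          # 1-dimensional latent state
```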