Signal Temporal Logic (STL) is a powerful framework for describing the complex temporal and logical behaviour of dynamical systems. Numerous studies have attempted to use reinforcement learning to learn controllers that enforce STL specifications; however, they have not effectively addressed two challenges: ensuring robust satisfaction in continuous state spaces and maintaining tractability. In this paper, leveraging the concept of funnel functions, we propose a tractable reinforcement learning algorithm that learns a time-dependent policy for robust satisfaction of STL specifications in continuous state spaces. We demonstrate the utility of our approach on several STL tasks in different environments.