We study challenges using reinforcement learning in controlling energy systems, where apart from performance requirements, one has additional safety requirements such as avoiding blackouts. We detail how these safety requirements in real-time temporal logic can be strengthened via discretization into linear temporal logic (LTL), such that the satisfaction of the LTL formulae implies the satisfaction of the original safety requirements. The discretization enables advanced engineering methods such as synthesizing shields for safe reinforcement learning as well as formal verification, where for statistical model checking, the probabilistic guarantee acquired by LTL model checking forms a lower bound for the satisfaction of the original real-time safety requirements.
翻译:我们研究了强化学习在能源系统控制中面临的挑战,除了性能要求外,还需满足避免断电等附加安全要求。详细阐述了如何通过离散化将实时时序逻辑中的安全要求强化为线性时序逻辑(LTL),使得LTL公式的满足性蕴含原始安全要求的满足性。该离散化方法可支持先进工程手段,例如为安全强化学习合成防护机制以及形式化验证——在统计模型检验中,通过LTL模型检验获得的概率性保证构成了原始实时安全要求满足性的下界。