Automatic control increasingly draws on concepts that originate in machine learning. Among these, reinforcement learning (RL) plays a prominent role, as it is inherently designed for sequential decision making and can be applied to optimal control problems without a model of the plant. To advance the education of control engineers and operators in this field, this contribution aims to provide an RL framework that can be applied to educational hardware provided by the Lucas-Nülle company. Specifically, inverted pendulum control is pursued by means of RL, covering both swing-up and stabilization within a single holistic design approach. The actual learning is enabled by separating the corresponding computations from the real-time control computer and offloading them to separate hardware. This distributed architecture, however, requires communication between the involved components, which is realized via CAN bus. An experimental proof of concept is presented together with a safeguarding algorithm that prevents the plant from being operated in a harmful way during the trial-and-error training phase.
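The abstract does not specify how the safeguarding algorithm works. As a rough illustration of the general idea only, the following minimal sketch assumes a cart-type pendulum whose cart position must stay within a limited track: the command proposed by the RL agent is saturated, and any command that would push the cart further toward a track end is blocked inside a margin zone. The names `SafetyLimits` and `safeguard` and all numeric values are hypothetical and are not taken from the described framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyLimits:
    # All values are hypothetical placeholders, not plant data.
    x_max: float = 0.30   # assumed usable half-length of the track in m
    u_max: float = 1.0    # assumed normalized actuator limit
    margin: float = 0.05  # assumed width of the protected zone in m


def safeguard(x: float, u_rl: float, lim: SafetyLimits = SafetyLimits()) -> float:
    """Filter the RL command u_rl given the cart position x.

    The command is first saturated to the actuator range. Inside the
    protected zone near a track end, commands pointing toward that end
    are replaced by zero; commands pointing away from it pass through.
    """
    u = max(-lim.u_max, min(lim.u_max, u_rl))
    near_right_end = x >= lim.x_max - lim.margin
    near_left_end = x <= -(lim.x_max - lim.margin)
    if near_right_end and u > 0.0:
        return 0.0
    if near_left_end and u < 0.0:
        return 0.0
    return u
```

A position-only filter like this ignores the cart velocity, so a practical safeguard would likely also need to account for braking distance. It is meant only to show where such a filter sits, namely between the learning agent and the actuator.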