This paper presents the design and implementation of data-driven optimal derivative feedback controllers for an active magnetic levitation system. A direct, model-free control design method based on reinforcement learning is compared with an indirect optimal control design derived from a numerically identified mathematical model of the system. For the direct model-free approach, a policy iteration procedure is proposed that adds an iteration layer, called the epoch loop, to gather multiple sets of process data, providing a more diverse dataset and helping to reduce learning bias. This direct method is evaluated against a comparable optimal controller designed from a plant model identified by combining Dynamic Mode Decomposition with Control (DMDc) and Prediction Error Minimization (PEM). Results show that both controllers stabilize the magnetic levitation system and improve its performance relative to controllers designed from a nominal model, and that the direct model-free approach consistently outperforms the indirect solution when multiple epochs are allowed. The iterative refinement of the optimal control law over the epoch loop gives the direct approach a clear advantage over the indirect method, which relies on a single set of system data to determine both the identified model and the control law.
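To make the indirect route more concrete, the following is a minimal sketch of the least-squares step behind DMDc, which fits a discrete-time model x[k+1] = A x[k] + B u[k] from a single set of state and input snapshots. It is illustrative only: the function name `dmdc_fit`, the two-state plant matrices, and the random excitation are assumptions made for this example and are not taken from the paper, and the PEM refinement and the derivative feedback design are not shown.

```python
import numpy as np

def dmdc_fit(X, U):
    """Least-squares DMDc fit of x[k+1] ~= A x[k] + B u[k].

    X: (n, m+1) array of state snapshots, U: (p, m) array of inputs.
    Returns the estimates (A_hat, B_hat).
    """
    n = X.shape[0]
    X0, X1 = X[:, :-1], X[:, 1:]        # current and next-step states
    Omega = np.vstack([X0, U])          # stacked state/input data, (n+p, m)
    G = X1 @ np.linalg.pinv(Omega)      # [A B] in the least-squares sense
    return G[:, :n], G[:, n:]

# Hypothetical open-loop unstable 2-state plant (NOT the paper's maglev model).
A_true = np.array([[1.0, 0.01], [0.5, 1.0]])
B_true = np.array([[0.0], [0.02]])

# One set of process data, generated with random input excitation.
rng = np.random.default_rng(0)
m = 50
U = rng.standard_normal((1, m))
X = np.zeros((2, m + 1))
X[:, 0] = [0.01, 0.0]
for k in range(m):
    X[:, k + 1] = A_true @ X[:, k] + B_true @ U[:, k]

A_hat, B_hat = dmdc_fit(X, U)
```

On noise-free data such as this, the fit recovers the assumed matrices to numerical precision. With measured data, the estimate depends on the single dataset collected, which is the limitation of the indirect method that the abstract contrasts with the epoch loop.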