Online tuning of real-world plants is a complex optimisation problem that continues to require manual intervention by experienced human operators. Autonomous tuning is a rapidly expanding field of research in which learning-based methods, such as Reinforcement Learning-trained Optimisation (RLO) and Bayesian optimisation (BO), hold great promise for achieving outstanding plant performance and reducing tuning times. Which algorithm to choose in a given scenario, however, remains an open question. Here we present a comparative study using a routine task in a real particle accelerator as an example, showing that RLO generally outperforms BO but is not always the best choice. Based on the study's results, we provide a clear set of criteria to guide the choice of algorithm for a given tuning task. These criteria can ease the adoption of learning-based autonomous tuning solutions in the operation of complex real-world plants, ultimately improving the availability of these facilities and pushing the limits of their operability, thereby enabling scientific and engineering advancements.