Mathematical and computational tools have proven reliable in decision-making processes. Machine learning-based methods, in particular, have recently become increasingly popular as advanced support tools. For control problems, reinforcement learning has been applied to decision-making in several applications, most notably games. The success of these methods in solving complex problems motivates the exploration of new areas where they can be used to overcome current difficulties. In this paper, we explore the use of automatic control strategies for initial boundary value problems in thermal and disease transport. Specifically, we adapt an existing reinforcement learning algorithm using a stochastic policy gradient method, and we introduce two novel reward functions to drive the flow of the transported field. The new model-based framework exploits the interactions between a reaction-diffusion model and the modified agent. The results show that certain controls can be implemented successfully in these applications, although model simplifications had to be assumed.