Reinforcement learning (RL) can be used to build autonomous robots for power plant inspection. The approach simulates the inspection environment and trains the robot with a simple RL algorithm, and it could be applied in several sectors, including electricity generation. This research proposes a pre-trained model that covers perception, planning, and action. Deep Q-network (DQN), an RL framework introduced by DeepMind in 2015, combines deep learning with Q-learning and is suited to optimization problems such as Unmanned Aerial Vehicle (UAV) navigation. To overcome the limitations of current inspection procedures, the research proposes a power plant inspection system that combines UAV autonomous navigation with DQN reinforcement learning. Unlike other RL training techniques now in use, the training process defines its reward functions with reference to the states and accounts for both internal and external influencing factors. For instance, the states introduced in the RL component include the simulated wind field, the UAV's battery charge level, and the altitude the UAV has reached. The trained model enables the UAV to navigate on its own in difficult environments, which makes it more likely that the inspection strategy will be applied in practice. The model's average score converges to 9,000, and the trained model allowed the UAV to reach the target point with the fewest rotations necessary.
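To make the state-dependent reward idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical state made of position, altitude, battery charge, and local wind speed, and a shaped reward that mixes progress toward the target with penalties for wind exposure and rotations. All field names, weights, and thresholds are illustrative assumptions; the abstract does not specify them. The `td_target` helper shows the standard DQN bootstrapped target that such a reward would feed into.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple
import math


@dataclass(frozen=True)
class UAVState:
    """Hypothetical state; field names are illustrative, not from the paper."""
    x: float           # horizontal position (m)
    y: float           # horizontal position (m)
    altitude: float    # height above ground (m)
    battery: float     # remaining charge in [0, 1]
    wind_speed: float  # simulated wind magnitude at the UAV (m/s)


def reward(prev: UAVState, curr: UAVState,
           target: Tuple[float, float], rotated: bool) -> float:
    """Shaped reward using external (wind) and internal (battery, altitude) factors.

    Every weight below is an assumed placeholder value.
    """
    def dist(s: UAVState) -> float:
        return math.hypot(s.x - target[0], s.y - target[1])

    r = dist(prev) - dist(curr)       # reward progress toward the target
    r -= 0.1 * curr.wind_speed        # penalize flying through strong wind
    r -= 0.5 if rotated else 0.0      # discourage unnecessary rotations
    if curr.battery <= 0.0 or curr.altitude <= 0.0:
        r -= 100.0                    # depleted battery or ground contact
    if dist(curr) < 1.0:
        r += 100.0                    # bonus for reaching the target point
    return r


def td_target(r: float, next_q_values: Sequence[float],
              done: bool, gamma: float = 0.99) -> float:
    """Standard DQN target: r + gamma * max over next-state Q-values.

    No bootstrapping is applied when the episode has ended.
    """
    return r if done else r + gamma * max(next_q_values)
```

In a full training loop, `next_q_values` would come from a target network evaluated on the next state, and the online network would be regressed toward `td_target`. The rotation penalty is one plausible way to encourage the "fewest rotations" behavior the abstract reports.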