We propose a novel deep reinforcement learning (DRL) architecture for sequential decision processes under uncertainty, as encountered in inspection and maintenance (I&M) planning. Unlike other DRL algorithms for I&M planning, the proposed +RQN architecture dispenses with computing the belief state and instead handles erroneous observations directly. We apply the algorithm to a basic I&M planning problem for a one-component system subject to deterioration. In addition, we investigate the performance of Monte Carlo tree search on the same I&M problem and compare it with the +RQN. The comparison includes a statistical analysis of the policies produced by the two methods, as well as a visualization of these policies in the belief space.
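For context, the belief-state computation that the +RQN is said to dispense with can be illustrated by a discrete Bayes filter. The following is a minimal sketch only, not the model used in this work: the three condition states, the transition matrix `T`, the inspection matrix `O`, and the function name `belief_update` are all hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical 3-state deterioration model (0 = intact, 1 = damaged, 2 = failed).
# T[i, j] = P(next state j | current state i) when no repair is carried out.
# All numbers are illustrative assumptions, not values from the paper.
T = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.0, 1.0],
])

# Hypothetical imperfect inspection: O[j, o] = P(observation o | true state j).
O = np.array([
    [0.8, 0.2, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.1, 0.9],
])


def belief_update(belief, observation):
    """One step of a discrete Bayes filter: predict, then correct."""
    # Prediction: propagate the belief through the deterioration model.
    predicted = belief @ T
    # Correction: weight each state by the likelihood of the observation.
    unnormalized = predicted * O[:, observation]
    # Normalize so that the posterior sums to one.
    return unnormalized / unnormalized.sum()


# Start from a component known to be intact and process three inspections.
belief = np.array([1.0, 0.0, 0.0])
for obs in [0, 1, 1]:
    belief = belief_update(belief, obs)
```

A belief-based agent would feed `belief` to its policy at every step, which requires knowing `T` and `O`. An approach that acts on the raw observation history, as described for the +RQN, avoids this explicit update.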