Autonomous, learning systems based on Deep Reinforcement Learning have become a foundation for building resilient and efficient Cyber-Physical Energy Systems. However, most current approaches suffer from two distinct problems: modern model-free algorithms such as Soft Actor-Critic need a large number of samples to learn a meaningful policy, and they need a fallback to guard against concept drift (e.g., catastrophic forgetting). In this paper, we present work in progress towards a hybrid agent architecture that combines model-based Deep Reinforcement Learning with imitation learning to overcome both problems.