Effective residential appliance scheduling is crucial for sustainable living. While multi-objective reinforcement learning (MORL) has proven effective in balancing user preferences in appliance scheduling, traditional MORL struggles with limited data in non-stationary residential settings characterized by variations in renewable generation, where significant context shifts can invalidate previously learned policies. To address these challenges, we extend state-of-the-art MORL algorithms with the meta-learning paradigm, enabling rapid, few-shot adaptation to shifting contexts. Additionally, we employ an auto-encoder (AE)-based unsupervised method to detect environment context changes. We have also developed a residential energy environment to evaluate our method using real-world data from London residential settings. This study not only assesses the application of MORL in residential appliance scheduling but also underscores the effectiveness of meta-learning in energy management. Our top-performing method significantly surpasses the best baseline: the trained model achieves a 3.28% saving on electricity bills, a 2.74% increase in user comfort, and a 5.9% improvement in expected utility. Additionally, it reduces the sparsity of solutions by 62.44%. Remarkably, these gains were achieved using 96.71% less training data and 61.1% fewer training steps.