Controlling energy consumption in buildings through demand response (DR) has become increasingly important for reducing global carbon emissions and limiting climate change. In this paper, we focus on controlling the heating system of a residential building to optimize its energy consumption while respecting the user's thermal comfort. Recent work in this area has mainly relied on either model-based control, e.g., model predictive control (MPC), or model-free reinforcement learning (RL) to implement practical DR algorithms. Monte Carlo Tree Search (MCTS) is an RL method that has recently achieved impressive success in domains such as board games (Go, chess), yet it has remained largely unexplored for building control. We therefore study MCTS for building demand response. Its natural structure allows flexible optimization that implicitly integrates exogenous constraints (as opposed, for example, to conventional RL solutions), making MCTS a promising candidate for DR control problems. We demonstrate how to improve MCTS control performance by using a Physics-informed Neural Network (PiNN) model for the underlying thermal state prediction, rather than a traditional purely data-driven black-box approach. Our MCTS implementation, coupled with a PiNN model, obtains a 3% higher reward than a rule-based controller, leading to a 10% cost reduction and a 35% reduction in the deviation from the desired temperature when applied to an artificial price profile. We further add a deep learning layer to MCTS, using a neural network that guides the tree search toward more promising nodes. We then compare this extension with the vanilla version and show the resulting improvement in computational cost.