We tackle the challenge of learning to schedule Electric Vehicle (EV) charging under Out-of-Distribution (OOD) data. Traditional scheduling algorithms typically fail to balance near-optimal average performance with worst-case guarantees, particularly on OOD data. Model Predictive Control (MPC) is often too conservative and data-independent, whereas Reinforcement Learning (RL) tends to be overly aggressive and to trust the data fully, so neither consistently achieves the best of both worlds. To bridge this gap, we introduce a novel OOD-aware scheduling algorithm, denoted OOD-Charging. The algorithm employs a dynamic "awareness radius", updated in real time based on the Temporal Difference (TD) error, which reflects the severity of the distribution shift. OOD-Charging thereby strikes a more effective balance between consistency and robustness in EV charging schedules, significantly enhancing adaptability and efficiency in real-world charging environments. Our results demonstrate that this approach reliably improves the scheduling reward under real OOD scenarios marked by significant shifts in EV charging behavior caused by COVID-19 in the Caltech ACN-Data.
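The core idea of an "awareness radius" can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual algorithm: the function names (`ood_aware_action`, `update_radius`), the Euclidean projection of the RL advice onto a ball around the MPC action, and the specific radius-update rule (step size `eta`, reference error `0.5`) are all assumptions chosen for clarity; the only element taken from the abstract is that the radius shrinks when the TD-error signals severe OOD and grows when the data appears trustworthy.

```python
import numpy as np

def ood_aware_action(mpc_action, rl_action, radius):
    """Follow RL advice only within a trust ball of the given radius
    around the robust MPC action (hypothetical projection scheme)."""
    diff = rl_action - mpc_action
    norm = np.linalg.norm(diff)
    if norm <= radius:
        return rl_action  # RL advice is close enough to trust fully
    # Otherwise project the RL advice onto the ball's boundary
    return mpc_action + radius * diff / norm

def update_radius(radius, td_error, eta=0.1, ref=0.5, r_max=1.0):
    """Shrink the radius when |TD-error| exceeds a reference level
    (severe OOD), grow it when predictions match observations.
    The update rule and constants here are illustrative assumptions."""
    return float(np.clip(radius - eta * (abs(td_error) - ref), 0.0, r_max))
```

Under this sketch, a large TD-error drives the schedule back toward the conservative MPC action (robustness), while a small TD-error lets the learned policy act freely (consistency).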