This paper proposes a slot-based energy storage approach for decision-making by an off-grid telecommunication operator. We consider network systems powered by solar panels, where harvested energy is stored in a battery and the stored energy can be sold once the battery is fully charged. To reflect real-world conditions, we account for non-stationary energy arrivals and service demands that depend on the time of day, as well as for the failure states of the PV panel. The network operator we model faces two conflicting objectives: keeping its infrastructure running and selling surplus energy from fully charged batteries (or supplying it to other networks). To address these challenges, we develop a slot-based Markov Decision Process (MDP) model that incorporates positive rewards for energy sales, as well as penalties for energy loss and battery depletion. This slot-based MDP follows a specific structure that we have previously proven to be efficient in terms of computational performance and accuracy. From this model, we derive the optimal policy that balances these conflicting objectives and maximizes the average reward. Additionally, we present results comparing different cities and months, which the operator can use when deploying its infrastructure to maximize rewards based on location-specific energy availability and seasonal variations.
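To make the ingredients of such a model concrete, the following is a minimal sketch of a slot-based MDP with a battery level, a two-state PV panel, a sale reward, and penalties for lost energy and depletion. It is not the model or the solution method of the paper. Every number (four slots per day, a capacity of five units, the harvest distributions, the demands, the panel failure and repair probabilities, the reward and penalty values) is an illustrative assumption, as is the rule that a sale hands over the full battery and leaves an empty one. The solver is generic relative value iteration applied once per day, with one backward sweep over the slots per iteration, which is one standard way to handle the periodic slot structure when maximizing average reward.

```python
from itertools import product

# All values below are illustrative assumptions, not taken from the paper.
K = 4                      # slots per day: night, morning, noon, evening
B = 5                      # battery capacity in energy units
DEMAND = [1, 0, 0, 1]      # energy units consumed in each slot
# P(harvest = 0, 1, 2 units) in each slot while the panel works
HARVEST = [[1.0, 0.0, 0.0], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6], [0.6, 0.3, 0.1]]
# Panel chain: PANEL[p][q] = P(next state q | state p); 0 = failed, 1 = working
PANEL = [[0.5, 0.5], [0.05, 0.95]]
SELL_REWARD, LOSS_PENALTY, DEPLETION_PENALTY = 10.0, 1.0, 20.0

STATES = list(product(range(B + 1), (0, 1)))   # (battery level, panel state)


def q_value(t, b, p, sell, v_next):
    """Expected reward-to-go of one action taken in slot t from state (b, p)."""
    base = SELL_REWARD if sell else 0.0
    start = 0 if sell else b                 # assumed: a sale leaves an empty battery
    pmf = HARVEST[t] if p == 1 else [1.0]    # a failed panel harvests nothing
    total = 0.0
    for harvest, prob in enumerate(pmf):
        if prob == 0.0:
            continue
        level = start + harvest - DEMAND[t]
        reward = base
        if level < 0:                        # demand could not be served
            reward -= DEPLETION_PENALTY
            level = 0
        elif level > B:                      # surplus energy is lost
            reward -= LOSS_PENALTY * (level - B)
            level = B
        for q in (0, 1):
            total += prob * PANEL[p][q] * (reward + v_next[(level, q)])
    return total


def backward_day(v_end):
    """One backward sweep over the K slots, starting from end-of-day values."""
    policy = [dict() for _ in range(K)]
    v = v_end
    for t in reversed(range(K)):
        v_new = {}
        for b, p in STATES:
            # Selling is only allowed when the battery is full.
            actions = (False, True) if b == B else (False,)
            best = max(actions, key=lambda a: q_value(t, b, p, a, v))
            policy[t][(b, p)] = best
            v_new[(b, p)] = q_value(t, b, p, best, v)
        v = v_new
    return v, policy


def solve(tol=1e-9, max_days=10_000):
    """Relative value iteration over whole days; returns (gain per slot, policy)."""
    ref = (0, 1)                             # reference state used for normalization
    h = {s: 0.0 for s in STATES}
    for _ in range(max_days):
        v, policy = backward_day(h)
        diff = [v[s] - h[s] for s in STATES]
        h = {s: v[s] - v[ref] for s in STATES}
        if max(diff) - min(diff) < tol:      # span small enough: gain has settled
            break
    gain_per_slot = 0.5 * (max(diff) + min(diff)) / K
    return gain_per_slot, policy


if __name__ == "__main__":
    gain, policy = solve()
    print("average reward per slot:", round(gain, 4))
    for t in range(K):
        print("slot", t, "sell when full and panel working:", policy[t][(B, 1)])
```

Changing `HARVEST` and `DEMAND` to profiles for a different location or month and solving again gives a rough feel for how the average reward and the sell decisions shift with energy availability, which is the kind of comparison the abstract describes.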