This paper studies an online optimal resource reservation problem in communication networks with job transfers, where the goal is to minimize the reservation cost while keeping the blocking cost within a given budget. To tackle this problem, we propose a novel algorithm based on a randomized exponentially weighted method that accommodates long-term constraints. We analyze the performance of our algorithm by establishing upper bounds on the associated regret and on the cumulative constraint violation. Finally, we present numerical experiments comparing our algorithm with a reinforcement learning approach, in which our algorithm outperforms it.
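To illustrate the kind of method the abstract refers to, the following is a minimal sketch of a randomized exponentially weighted scheme with a dual variable that prices a long-term budget constraint. It is not the algorithm of this paper, whose update rules are not given in the abstract. Everything here is an assumption made for illustration: the function name `run_constrained_hedge`, a finite set of reservation levels, full-information feedback (the cost of every level is observed each round), and the step sizes `eta` and `mu`.

```python
import math
import random


def softmax(log_w):
    """Turn log-weights into a probability vector (shifted for stability)."""
    m = max(log_w)
    exps = [math.exp(v - m) for v in log_w]
    z = sum(exps)
    return [e / z for e in exps]


def run_constrained_hedge(costs, violations, eta=0.1, mu=0.1, seed=0):
    """Hypothetical primal-dual exponentially weighted sketch.

    costs[t][k]      : reservation cost of level k at round t (assumed observed).
    violations[t][k] : blocking cost minus budget for level k at round t;
                       a value <= 0 means the budget is respected.
    Returns (total_cost, total_violation, final_dual, final_distribution).
    """
    rng = random.Random(seed)
    n_actions = len(costs[0])
    log_w = [0.0] * n_actions  # uniform initial weights
    lam = 0.0                  # dual variable for the budget constraint
    total_cost = 0.0
    total_violation = 0.0

    for cost_t, viol_t in zip(costs, violations):
        probs = softmax(log_w)
        # Randomized decision: sample a reservation level from the weights.
        k = rng.choices(range(n_actions), weights=probs)[0]
        total_cost += cost_t[k]
        total_violation += viol_t[k]

        # Exponential-weights step on the Lagrangian loss cost + lam * violation.
        for a in range(n_actions):
            log_w[a] -= eta * (cost_t[a] + lam * viol_t[a])

        # Dual ascent on the expected violation, projected onto lam >= 0.
        expected_violation = sum(p * v for p, v in zip(probs, viol_t))
        lam = max(0.0, lam + mu * expected_violation)

    return total_cost, total_violation, lam, softmax(log_w)
```

In this sketch the dual variable grows while the sampled distribution violates the budget on average, which makes costly-to-block levels progressively less attractive. When no level violates the budget, the dual variable stays at zero and the scheme reduces to plain exponential weighting on the reservation cost.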