In this paper, we tackle an online network resource allocation problem with job transfers. The network is composed of many servers connected by communication links. The system operates in discrete time: at each time slot, the administrator reserves resources at the servers for future job requests, and a cost is incurred for the reservations made. Then, once the job requests are received, jobs may be transferred between servers to best accommodate the demand, which incurs an additional transport cost. Finally, if a job request cannot be satisfied, a violation occurs and a cost must be paid for the blocked job. We propose a randomized online algorithm based on the exponentially weighted method. We prove that the algorithm's regret is sub-linear in time, which indicates that it adapts and learns from experience, making more efficient decisions as it accumulates data. Moreover, we test the algorithm on artificial data and compare it against a reinforcement learning method, showing that the proposed method outperforms the latter.
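To make the setting concrete, the following is a minimal sketch of an exponentially weighted (Hedge-style) randomized learner applied to a toy instance of the problem. It is not the algorithm analyzed in the paper. Everything specific here is an assumption made for illustration: the two-server network with a single link, the prices `c_res`, `c_move` and `c_block`, the finite grid of candidate reservations, the learning rate `eta`, the uniformly random demand, and the full-information feedback (the cost of every candidate reservation is evaluated once the demand is revealed).

```python
import math
import random


def slot_cost(reservation, demand, c_res=1.0, c_move=0.5, c_block=5.0):
    """Toy per-slot cost for two servers joined by one link (prices are assumed)."""
    r0, r1 = reservation
    d0, d1 = demand
    # Jobs exceeding the local reservation at each server.
    over0, over1 = max(d0 - r0, 0), max(d1 - r1, 0)
    # Unused reserved capacity at each server.
    spare0, spare1 = max(r0 - d0, 0), max(r1 - d1, 0)
    # Overflow is moved to the other server as long as it has spare capacity.
    moved = min(over0, spare1) + min(over1, spare0)
    # Whatever cannot be served anywhere is blocked.
    blocked = over0 + over1 - moved
    return c_res * (r0 + r1) + c_move * moved + c_block * blocked


class ExponentialWeights:
    """Hedge-style learner over a finite set of candidate reservations."""

    def __init__(self, actions, eta, rng=None):
        self.actions = list(actions)
        self.eta = eta  # learning rate (hypothetical value, not tuned)
        self.log_w = [0.0] * len(self.actions)  # log-weights for stability
        self.rng = rng or random.Random(0)

    def probabilities(self):
        # Subtract the maximum before exponentiating to avoid underflow.
        m = max(self.log_w)
        w = [math.exp(v - m) for v in self.log_w]
        total = sum(w)
        return [x / total for x in w]

    def sample(self):
        # Randomized decision: draw a reservation according to the weights.
        return self.rng.choices(self.actions, weights=self.probabilities(), k=1)[0]

    def update(self, losses):
        # Multiplicative update w_i <- w_i * exp(-eta * loss_i), in log form.
        for i, loss in enumerate(losses):
            self.log_w[i] -= self.eta * loss


# Candidate reservations: 0 to 3 units at each of the two servers.
actions = [(r0, r1) for r0 in range(4) for r1 in range(4)]
learner = ExponentialWeights(actions, eta=0.1)
demand_rng = random.Random(1)

total_cost = 0.0
for _ in range(200):
    chosen = learner.sample()  # reserve before the demand is known
    demand = (demand_rng.randint(0, 2), demand_rng.randint(0, 2))
    total_cost += slot_cost(chosen, demand)
    # Full-information feedback: score every candidate on the revealed demand.
    learner.update([slot_cost(a, demand) for a in actions])
```

The log-weight representation is a numerical convenience only; it gives the same sampling distribution as the usual multiplicative update while avoiding underflow over long horizons.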