We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher aims at maximizing the overall quality-of-service by balancing the load through a simple threshold policy. We demonstrate that such a policy is optimal on the fluid and diffusion scales, while only involving a small communication overhead, which is crucial for large-scale deployments. In order to set the threshold optimally, it is important, however, to learn the load of the system, which may be unknown. For that purpose, we design a control rule for tuning the threshold in an online manner. We derive conditions which guarantee that this adaptive threshold settles at the optimal value, along with estimates for the time until this happens. In addition, we provide numerical experiments which support the theoretical results and further indicate that our policy copes effectively with time-varying demand patterns.