This paper studies a dynamic discrete-time queuing model in which, at every period, players receive a new job and must send all their jobs to a queue with limited capacity. Players have an incentive to send their jobs as late as possible; however, if a job does not exit the queue by a fixed deadline, its owner incurs a penalty, and the job is returned to the player and rejoins the queue at the next period. Therefore, stability, i.e., boundedness of the number of jobs in the system, is not guaranteed. We show that if players are myopically strategic, then the system is stable when the penalty is high enough. Moreover, if players use a learning algorithm derived from a typical no-regret algorithm (exponential weights), then the system is stable when penalties are greater than a bound that depends on the total number of jobs in the system.
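For readers unfamiliar with the exponential weights rule mentioned above, the following is a minimal sketch of the textbook full-information update (often called Hedge). It is not the learning algorithm analyzed in the paper, which is only described as derived from this rule. The function name `exponential_weights`, the learning rate `eta`, and the two-action loss sequence are all illustrative assumptions.

```python
import math


def exponential_weights(loss_rounds, eta):
    """Textbook exponential-weights (Hedge) update, full-information setting.

    loss_rounds: list of per-round loss vectors, one loss in [0, 1] per action.
    eta: positive learning rate (an assumed tuning parameter).
    Returns the final probability distribution over actions.
    """
    n_actions = len(loss_rounds[0])
    # Work in the log domain so long horizons do not underflow.
    log_weights = [0.0] * n_actions
    for losses in loss_rounds:
        # Multiplicative update: w_i <- w_i * exp(-eta * loss_i).
        log_weights = [lw - eta * loss for lw, loss in zip(log_weights, losses)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_weights)
    unnormalized = [math.exp(lw - top) for lw in log_weights]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical example: action 0 always incurs the full loss (think of a
# sending time that always misses the deadline), action 1 never does.
rounds = [[1.0, 0.0]] * 10
probs = exponential_weights(rounds, eta=0.5)
```

In this toy run, the probability mass shifts toward the action with lower cumulative loss. How actions, losses, and penalties are actually defined in the queuing game is specified in the paper, not in this sketch.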