The Kelly, or proportional allocation, mechanism is a simple and efficient auction-based scheme that distributes an infinitely divisible resource in proportion to the agents' bids. When agents are aware of the allocation rule, their interactions form a game that has been extensively studied in the literature. This paper examines the less explored repeated Kelly game, focusing mainly on utilities that are logarithmic in the allocated resource fraction. We first derive this logarithmic form from fairness-throughput trade-offs in wireless network slicing, and then prove that the induced stage game admits a unique Nash equilibrium (NE). For the repeated play, we prove convergence to this NE under three behavioral models: (i) all agents use Online Gradient Descent (OGD); (ii) all agents use Dual Averaging with a quadratic regularizer (DAQ), a variant of the Follow-the-Regularized-Leader algorithm; and (iii) all agents play myopic best responses (BR). Our convergence results hold even when agents use personalized learning rates in OGD and DAQ (e.g., tuned to optimize individual regret bounds), and they extend to a broader class of utilities that satisfy a certain sufficient condition. Finally, we complement our theoretical results with extensive simulations of the repeated Kelly game under several behavioral models, comparing them in terms of convergence speed to the NE and per-agent time-average utility. The results suggest that BR achieves the fastest convergence and the highest time-average utility, and that convergence to the stage-game NE may fail under heterogeneous update rules.
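To make the setting concrete, the following is a minimal sketch of a repeated Kelly game under BR and OGD dynamics. It is not the paper's exact model: it assumes that agent i's payoff is `a_i * log(x_i) - z_i`, where `z_i` is the bid, `x_i = z_i / sum_j z_j` is the allocated fraction, and the bid is paid in full. The weights `a_i`, the constant per-agent step sizes, the bid bounds `z_min` and `z_max`, and all function names are hypothetical choices made for illustration. Under these assumptions the first-order condition gives a closed-form best response, and in the symmetric case with n agents and common weight a the fixed point is `z_i = a (n - 1) / n`.

```python
import math


def allocate(bids):
    """Kelly (proportional) allocation: each agent gets bid / total bids."""
    total = sum(bids)
    return [b / total for b in bids]


def payoff_gradient(i, bids, weights):
    """Derivative of a_i * log(z_i / (z_i + s)) - z_i with respect to z_i,
    where s is the sum of the other agents' bids (assumed payoff form)."""
    z = bids[i]
    s = sum(bids) - z
    return weights[i] * s / (z * (z + s)) - 1.0


def best_response(i, bids, weights):
    """Positive root of z^2 + s*z - a_i*s = 0, i.e. the bid at which
    the gradient above vanishes for fixed opponents' bids."""
    s = sum(bids) - bids[i]
    a = weights[i]
    return 0.5 * (-s + math.sqrt(s * s + 4.0 * a * s))


def run_br(bids, weights, rounds=200):
    """Simultaneous myopic best responses, repeated for a fixed number of rounds."""
    bids = list(bids)
    for _ in range(rounds):
        bids = [best_response(i, bids, weights) for i in range(len(bids))]
    return bids


def run_ogd(bids, weights, rates, rounds=5000, z_min=0.05, z_max=5.0):
    """Projected gradient steps on each agent's own payoff (equivalently,
    descent on the negated payoff), with a personalized constant rate per agent.
    Bids are projected onto the assumed interval [z_min, z_max]."""
    bids = list(bids)
    for _ in range(rounds):
        grads = [payoff_gradient(i, bids, weights) for i in range(len(bids))]
        bids = [
            min(z_max, max(z_min, z + eta * g))
            for z, eta, g in zip(bids, rates, grads)
        ]
    return bids


if __name__ == "__main__":
    weights = [1.0, 1.0, 1.0]
    start = [0.1, 0.5, 1.0]
    print("BR :", run_br(start, weights))
    print("OGD:", run_ogd(start, weights, rates=[0.05, 0.02, 0.1]))
```

In this symmetric three-agent example both dynamics should settle near a bid of 2/3 per agent, which is the closed-form fixed point under the assumed payoff. The constant step sizes are a simplification; regret-optimal schedules would typically decay over time.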