We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner, where "decentralized" means that players make decisions individually without the influence of a central platform, and "uncoordinated" means that players do not need to synchronize their decisions using pre-specified rules. First, we provide a game formulation for this problem with known preferences, where the set of pure Nash equilibria (NE) coincides with the set of stable matchings, and mixed NE can be rounded to a stable matching. Then, we show that for hierarchical markets, applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. Moreover, we show that EXP converges locally and exponentially fast to a stable matching in general markets. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability. Finally, we provide stronger feedback conditions under which it is possible to drive the market faster toward an approximate stable matching. Our proposed game-theoretic framework bridges the discrete problem of learning stable matchings with the problem of learning NE in continuous-action games.
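To make the two objects named in the abstract concrete, the following is a minimal sketch in Python. It is not the paper's game formulation or its algorithm. It only illustrates (i) the standard definition of a stable matching as one with no blocking pair, and (ii) a generic exponential-weights update over a finite action set. The names `blocking_pairs`, `is_stable`, `exp_weights_update`, the "player"/"arm" labels, and the step size `eta` are all hypothetical choices made for illustration. Preference lists are assumed to be complete and strict.

```python
import math


def blocking_pairs(matching, player_prefs, arm_prefs):
    """Return every (player, arm) pair that blocks `matching`.

    matching:     dict player -> arm (one-to-one; unmatched players omitted)
    player_prefs: dict player -> list of arms, most preferred first
    arm_prefs:    dict arm -> list of players, most preferred first
    Assumption: every arm ranks every player (complete, strict lists).
    """
    holder = {arm: player for player, arm in matching.items()}
    pairs = []
    for p, prefs in player_prefs.items():
        for a in prefs:
            if matching.get(p) == a:
                break  # remaining arms are ranked below p's current match
            # Here p strictly prefers a to its current match (or p is unmatched).
            current = holder.get(a)
            ranking = arm_prefs[a]
            if current is None or ranking.index(p) < ranking.index(current):
                pairs.append((p, a))  # a also prefers p, so (p, a) blocks
    return pairs


def is_stable(matching, player_prefs, arm_prefs):
    """A matching is stable exactly when it has no blocking pair."""
    return not blocking_pairs(matching, player_prefs, arm_prefs)


def exp_weights_update(weights, payoffs, eta):
    """One generic exponential-weights step (illustrative, not the paper's EXP).

    Each weight is multiplied by exp(eta * payoff) and the result is normalized
    so the weights again form a probability distribution over actions.
    """
    scaled = [w * math.exp(eta * u) for w, u in zip(weights, payoffs)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Toy market (hypothetical data): both players prefer a1, both arms prefer p1.
player_prefs = {"p1": ["a1", "a2"], "p2": ["a1", "a2"]}
arm_prefs = {"a1": ["p1", "p2"], "a2": ["p1", "p2"]}

stable = {"p1": "a1", "p2": "a2"}
unstable = {"p1": "a2", "p2": "a1"}
```

In this toy market, `is_stable(stable, player_prefs, arm_prefs)` holds, while `unstable` is blocked by the pair `("p1", "a1")`: player `p1` prefers `a1` to its assigned arm, and `a1` prefers `p1` to the player it currently holds. The update function shifts probability mass toward actions with higher observed payoff, which is the basic mechanism behind exponential-weights learning in general.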