We consider three distinct discrete-time models of learning and evolution in games: a biological model based on intra-species selective pressure, the dynamics induced by pairwise proportional imitation, and the exponential / multiplicative weights (EW) algorithm for online learning. Even though these models share the same continuous-time limit, the replicator dynamics, we show that second-order effects play a crucial role and may lead to drastically different behaviors in each model, even in very simple, symmetric 2×2 games. Specifically, we study the resulting discrete-time dynamics in a class of parametrized congestion games, and we show that (i) in the biological model of intra-species competition, the dynamics remain convergent for any parameter value; (ii) under pairwise proportional imitation, the dynamics exhibit an entire range of behaviors for large step sizes, depending on the equilibrium configuration (stability, instability, and even Li-Yorke chaos); while (iii) for the EW algorithm, increasing the step size will (almost) inevitably lead to chaos (again, in the formal, Li-Yorke sense). This divergence of behaviors stands in stark contrast to the globally convergent behavior of the replicator dynamics, and serves to delineate the extent to which the replicator dynamics provide a useful predictor for the long-run behavior of their discrete-time origins.
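To make the step-size dependence of the EW algorithm concrete, the following is a minimal sketch of the exponential weights update in a two-strategy population game. The linear cost functions, the equilibrium parameter `b`, and the helper names `ew_step` and `ew_tail` are illustrative assumptions, not the parametrization used in the paper. The sketch only shows whether the iterates settle at the equilibrium; it does not test for Li-Yorke chaos.

```python
import math


def ew_step(x, eta, b):
    """One exponential-weights update for the fraction x playing strategy 1.

    Assumed (hypothetical) linear congestion costs, chosen so that the
    unique interior equilibrium is x = b:
        cost1(x) = (1 - b) * x,   cost2(x) = b * (1 - x)
    """
    cost1 = (1.0 - b) * x
    cost2 = b * (1.0 - x)
    # Each strategy's weight is discounted exponentially in its cost.
    w1 = x * math.exp(-eta * cost1)
    w2 = (1.0 - x) * math.exp(-eta * cost2)
    return w1 / (w1 + w2)


def ew_tail(x0, eta, b, steps=2000, keep=100):
    """Iterate the update from x0 and return the last `keep` iterates."""
    x = x0
    tail = []
    for t in range(steps):
        x = ew_step(x, eta, b)
        if t >= steps - keep:
            tail.append(x)
    return tail


if __name__ == "__main__":
    b = 0.7  # assumed equilibrium location
    for eta in (1.0, 30.0):  # small vs. large step size
        tail = ew_tail(0.2, eta, b)
        print(f"eta={eta}: tail range = [{min(tail):.4f}, {max(tail):.4f}]")
```

In this assumed game, linearizing the update at x = b gives a multiplier of 1 - η b(1 - b), so the equilibrium is locally stable only for η < 2 / (b(1 - b)). For step sizes above that threshold the iterates no longer settle at b, even though the continuous-time replicator dynamics for the same game converge.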