We propose learning dynamics that model how strategic agents repeatedly play a continuous game while relying on an information platform to learn an unknown payoff-relevant parameter. In each time step, the platform uses Bayes' rule to update a belief estimate of the parameter based on the players' strategies and realized payoffs. The players then adopt a generic learning rule to adjust their strategies based on the updated belief. We present results on the convergence of beliefs and strategies and on the properties of the fixed points to which the dynamics converge. We obtain necessary and sufficient conditions for the existence of globally stable fixed points, and we provide sufficient conditions for the local stability of fixed points. These results provide an approach to analyzing the long-term outcomes that arise from the interplay between Bayesian belief learning and strategy learning in games, and they enable us to characterize conditions under which learning leads to a complete information equilibrium.
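The two-step loop described in the abstract (a Bayesian belief update by the platform, followed by a strategy update by the players) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a two-player game with payoff `q_i * (theta - q_1 - q_2)`, a finite parameter set `thetas`, Gaussian payoff noise with standard deviation `sigma`, and a gradient-ascent step standing in for the generic learning rule. The function name `simulate` and all of its parameters are hypothetical.

```python
import numpy as np


def simulate(thetas=(1.0, 2.0), true_theta=2.0, sigma=0.1,
             step=0.1, n_steps=500, seed=0):
    """Toy belief-and-strategy learning loop (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)
    thetas = np.asarray(thetas, dtype=float)
    log_belief = np.zeros(len(thetas))   # uniform prior, kept in log space
    q = np.array([0.5, 0.5])             # initial strategies of the two players

    def payoffs(q, theta):
        # Assumed payoff: u_i = q_i * (theta - q_1 - q_2)
        return q * (theta - q.sum())

    for _ in range(n_steps):
        # Realized payoffs are the true payoffs plus Gaussian noise.
        y = payoffs(q, true_theta) + sigma * rng.normal(size=2)

        # Platform: Bayes' rule, adding the Gaussian log-likelihood
        # of the observed payoffs under each candidate parameter.
        for k, theta in enumerate(thetas):
            resid = y - payoffs(q, theta)
            log_belief[k] += -0.5 * np.sum(resid ** 2) / sigma ** 2
        log_belief -= log_belief.max()    # avoid underflow before exponentiating
        belief = np.exp(log_belief)
        belief /= belief.sum()

        # Players: one gradient-ascent step on expected payoff under the belief.
        # d/dq_i [q_i * (theta - q_i - q_j)] = theta - 2*q_i - q_j
        theta_bar = belief @ thetas
        grad = theta_bar - q - q.sum()
        q = np.maximum(q + step * grad, 0.0)   # keep strategies non-negative

    return belief, q


if __name__ == "__main__":
    belief, q = simulate()
    print("belief:", belief)
    print("strategies:", q)
```

In this toy game the complete information equilibrium is `q_i = true_theta / 3`, so the returned strategies can be compared against that value. The sketch does not show when such convergence holds in general, which is what the stability conditions in the paper address.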