Steering control of payoff-maximizing players in adaptive learning dynamics

Evolutionary game theory provides a mathematical foundation for cross-disciplinary work, especially for integrating ideas from artificial intelligence and game theory. Such integration offers a transparent and rigorous approach to complex decision-making problems in a variety of important contexts, ranging from evolutionary computation to machine behavior. Although the space of individual behavioral strategies in iterated Prisoner's Dilemma (IPD) games is astronomically large, the so-called Zero-Determinant (ZD) strategies are a set of rather simple memory-one strategies that can nonetheless unilaterally enforce a linear payoff relationship between themselves and their opponent. Although knowledge of ZD strategies gives players an upper hand in IPD games, we find and characterize unbending strategies that can force ZD players to be fair in their own interest. Moreover, our analysis reveals the ubiquity of unbending properties in common IPD strategies, which were previously overlooked. In this work, we demonstrate the important steering role of unbending strategies in fostering fairness and cooperation in pairwise interactions. Our results offer a new perspective, combining game theory and multi-agent learning systems, for optimizing winning strategies that are robust to noise, errors, and deception in non-zero-sum games.
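The linear payoff relationship that a ZD strategy enforces can be checked numerically. The sketch below is illustrative only and is not taken from this work: it assumes the conventional payoff values (R, S, T, P) = (3, 0, 5, 1), uses the standard extortionate ZD construction with hypothetical parameters chi = 3 and phi = 1/26, and approximates long-run payoffs by iterating the four-state Markov chain over the outcomes (CC, CD, DC, DD). The function names `stationary`, `payoffs`, and `extortioner` are made up for this example.

```python
# Assumed conventional IPD payoffs; not specified in the abstract.
R, S, T, P = 3.0, 0.0, 5.0, 1.0

def stationary(p, q, iters=5000):
    """Approximate the stationary distribution over (CC, CD, DC, DD).

    Outcomes list player X's move first. p and q give each player's
    probability of cooperating after each outcome, indexed from that
    player's own point of view (own move first).
    """
    q_x = (q[0], q[2], q[1], q[3])  # re-index Y's strategy into X's outcome order
    v = [0.25] * 4
    for _ in range(iters):  # repeated application of the transition matrix
        nxt = [0.0] * 4
        for i in range(4):
            a, b = p[i], q_x[i]
            nxt[0] += v[i] * a * b
            nxt[1] += v[i] * a * (1 - b)
            nxt[2] += v[i] * (1 - a) * b
            nxt[3] += v[i] * (1 - a) * (1 - b)
        v = nxt
    return v

def payoffs(p, q):
    """Long-run average payoffs (s_x, s_y) for memory-one strategies p and q."""
    v = stationary(p, q)
    s_x = v[0] * R + v[1] * S + v[2] * T + v[3] * P
    s_y = v[0] * R + v[1] * T + v[2] * S + v[3] * P
    return s_x, s_y

def extortioner(chi, phi):
    """Extortionate ZD strategy intended to enforce s_x - P = chi * (s_y - P)."""
    return (
        1 - phi * (chi - 1) * (R - P),
        1 - phi * ((P - S) + chi * (T - P)),
        phi * ((T - P) + chi * (P - S)),
        0.0,
    )

if __name__ == "__main__":
    zd = extortioner(chi=3.0, phi=1.0 / 26.0)
    # Arbitrary opponent strategies, chosen only for illustration.
    for q in [(0.7, 0.2, 0.6, 0.3), (0.5, 0.5, 0.5, 0.5), (0.9, 0.1, 0.8, 0.4)]:
        s_x, s_y = payoffs(zd, q)
        print(q, s_x - P, 3.0 * (s_y - P))  # the two values should agree
```

Whatever memory-one strategy the opponent plays in this sketch, the two printed quantities should coincide up to numerical error, which is the sense in which the ZD player sets the payoff relationship unilaterally. The unbending strategies characterized in this work are not modeled here.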

