Resolving social dilemmas with minimal reward transfer - 专知论文

会员服务 ·

0

Agent · GROUP · 可理解性 · motivation · 成比例 ·

2024 年 3 月 21 日

Resolving social dilemmas with minimal reward transfer

翻译：基于最小奖励转移解决社会困境

Richard Willis,Yali Du,Joel Z Leibo,Michael Luck

from arxiv, 34 pages, 13 tables, submitted to the Journal of Autonomous Agents and Multi-Agent Systems: Special Issue on Citizen-Centric AI Systems

Multi-agent cooperation is an important topic, and is particularly challenging in mixed-motive situations where it does not pay to be nice to others. Consequently, self-interested agents often avoid collective behaviour, resulting in suboptimal outcomes for the group. In response, in this paper we introduce a metric to quantify the disparity between what is rational for individual agents and what is rational for the group, which we call the general self-interest level. This metric represents the maximum proportion of individual rewards that all agents can retain while ensuring that achieving social welfare optimum becomes a dominant strategy. By aligning the individual and group incentives, rational agents acting to maximise their own reward will simultaneously maximise the collective reward. As agents transfer their rewards to motivate others to consider their welfare, we diverge from traditional concepts of altruism or prosocial behaviours. The general self-interest level is a property of a game that is useful for assessing the propensity of players to cooperate and understanding how features of a game impact this. We illustrate the effectiveness of our method on several novel games representations of social dilemmas with arbitrary numbers of players.

翻译：多智能体合作是一个重要课题，在混合动机情境中尤为具有挑战性——此时善待他人并无益处。因此，自私的智能体常常回避集体行为，导致群体结果次优。为此，本文提出一个量化个体理性与群体理性差距的度量指标，称为一般自利水平。该指标表示所有智能体在确保实现社会福利最优成为占优策略的条件下，可保留的个体奖励最大比例。通过协调个体与群体激励，理性智能体在最大化自身奖励的同时，将同步最大化集体奖励。由于智能体通过转移奖励来促使他人考虑自身福祉，我们突破了传统的利他主义或亲社会行为概念。一般自利水平是博弈的一种属性，可用于评估参与者合作倾向并理解博弈特征对合作的影响。我们在多个包含任意数量参与者的新型社会困境博弈表征中验证了该方法的有效性。

0

相关内容

Agent

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【ACL2020】多模态信息抽取，365页ppt

【ACL2020】多模态信息抽取，365页ppt

专知会员服务

151+阅读 · 2020年7月6日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

学习自然语言处理路线图

学习自然语言处理路线图

专知会员服务

140+阅读 · 2019年9月24日

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

AI研习社

33+阅读 · 2019年4月18日

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

AI研习社

32+阅读 · 2019年4月5日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

开放知识图谱

28+阅读 · 2018年6月11日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

47+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

随机二阶锥互补问题理论与算法研究及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

考虑一般约束条件下的消费投资决策模型研究

国家自然科学基金

1+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

科学合作网络的不连通问题研究

国家自然科学基金

3+阅读 · 2014年12月31日

复杂多元数据的半参数统计推断

国家自然科学基金

5+阅读 · 2014年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

6+阅读 · 2014年12月31日

Simulating the economic impact of rationality through reinforcement learning and agent-based modelling

Arxiv

0+阅读 · 2024年5月3日

Compressing neural network by tensor network with exponentially fewer variational parameters

Arxiv

0+阅读 · 2024年5月3日

How A/B testing changes the dynamics of information spreading on a social network

Arxiv

0+阅读 · 2024年5月2日

Quantum cryptographic protocols with dual messaging system via 2D alternate quantum walks and genuine single particle entangled states

Quantum cryptographic protocols with dual messaging system via 2D alternate quantum walks and genuine single particle entangled states

Arxiv

0+阅读 · 2024年5月1日

How founder motivations, goals, and actions influence early trajectories of online communities

Arxiv

0+阅读 · 2024年5月1日

Metric geometry of the privacy-utility tradeoff

Arxiv

0+阅读 · 2024年5月1日

Correcting misinformation on social media with a large language model

Arxiv

0+阅读 · 2024年4月30日

Dynamic neighbourhood optimisation for task allocation using multi-agent

Arxiv

102+阅读 · 2022年5月11日

How to represent part-whole hierarchies in a neural network

Arxiv

13+阅读 · 2021年2月25日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

VIP会员

文章信息

相关主题

最新内容

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

专知会员服务

1+阅读 · 48分钟前

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

专知会员服务

1+阅读 · 今天7:39

《通用大语言模型：无人机指挥与控制接口》最新40页

《通用大语言模型：无人机指挥与控制接口》最新40页

专知会员服务

2+阅读 · 今天7:33

《通过小型无人机系统将情报能力“作战化”》

《通过小型无人机系统将情报能力“作战化”》

专知会员服务

1+阅读 · 今天7:28

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

专知会员服务

1+阅读 · 今天7:14

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

专知会员服务

17+阅读 · 6月15日

消耗优势：美军的“精确规模化”概念

消耗优势：美军的“精确规模化”概念

专知会员服务

7+阅读 · 6月15日

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

专知会员服务

8+阅读 · 6月15日

《网络空间兵棋推演：挑战、局限性与混合路径》报告

《网络空间兵棋推演：挑战、局限性与混合路径》报告

专知会员服务

8+阅读 · 6月15日

《离线语言支持系统：面向空战战术决策》

《离线语言支持系统：面向空战战术决策》

专知会员服务

8+阅读 · 6月15日

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

专知会员服务

6+阅读 · 6月15日

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

专知会员服务

6+阅读 · 6月14日

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

专知会员服务

6+阅读 · 6月14日

俄乌战场地面机器人如何改写战争规则

俄乌战场地面机器人如何改写战争规则

专知会员服务

9+阅读 · 6月14日

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

专知会员服务

13+阅读 · 6月14日

相关VIP内容

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

【ACL2020】多模态信息抽取，365页ppt

【ACL2020】多模态信息抽取，365页ppt

专知会员服务

151+阅读 · 2020年7月6日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

学习自然语言处理路线图

学习自然语言处理路线图

专知会员服务

140+阅读 · 2019年9月24日

热门VIP内容

开通专知VIP会员享更多权益服务

《面向导弹有效发射时机的监督机器学习方法：基于超视距空战仿真》

《通过小型无人机系统将情报能力“作战化”》

美国马六甲“三重网”概念：安全网、威慑网与杀伤网

《通用大语言模型：无人机指挥与控制接口》最新40页

相关资讯

灾难性遗忘问题新视角：迁移-干扰平衡

灾难性遗忘问题新视角：迁移-干扰平衡

CreateAMind

17+阅读 · 2019年7月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

Github项目推荐 | 股市预测的机器学习/深度学习模型/资源集锦

AI研习社

33+阅读 · 2019年4月18日

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

Github项目推荐 | 语义分割、实例分割、全景分割和视频分割的论文和基准列表

AI研习社

32+阅读 · 2019年4月5日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

利用动态深度学习预测金融时间序列基于Python

利用动态深度学习预测金融时间序列基于Python

量化投资与机器学习

18+阅读 · 2018年10月30日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

论文浅尝 | 嵌入常识知识的注意力 LSTM 模型用于特定目标的基于侧面的情感分析

开放知识图谱

28+阅读 · 2018年6月11日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

相关论文

Simulating the economic impact of rationality through reinforcement learning and agent-based modelling

Arxiv

0+阅读 · 2024年5月3日

Compressing neural network by tensor network with exponentially fewer variational parameters

Arxiv

0+阅读 · 2024年5月3日

How A/B testing changes the dynamics of information spreading on a social network

Arxiv

0+阅读 · 2024年5月2日

Quantum cryptographic protocols with dual messaging system via 2D alternate quantum walks and genuine single particle entangled states

Quantum cryptographic protocols with dual messaging system via 2D alternate quantum walks and genuine single particle entangled states

Arxiv

0+阅读 · 2024年5月1日

How founder motivations, goals, and actions influence early trajectories of online communities

Arxiv

0+阅读 · 2024年5月1日

Metric geometry of the privacy-utility tradeoff

Arxiv

0+阅读 · 2024年5月1日

Correcting misinformation on social media with a large language model

Arxiv

0+阅读 · 2024年4月30日

Dynamic neighbourhood optimisation for task allocation using multi-agent

Arxiv

102+阅读 · 2022年5月11日

How to represent part-whole hierarchies in a neural network

Arxiv

13+阅读 · 2021年2月25日

Contrastive learning of global and local features for medical image segmentation with limited annotations

Arxiv

19+阅读 · 2020年6月18日

相关基金

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

基于自主学习的Ad hoc Agent序贯决策研究

国家自然科学基金

47+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

随机二阶锥互补问题理论与算法研究及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

考虑一般约束条件下的消费投资决策模型研究

国家自然科学基金

1+阅读 · 2014年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

科学合作网络的不连通问题研究

国家自然科学基金

3+阅读 · 2014年12月31日

复杂多元数据的半参数统计推断

国家自然科学基金

5+阅读 · 2014年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

6+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员