Sequential Resource Trading Using Comparison-Based Gradient Estimation - 专知论文

会员服务 ·

0

Agent · 估计/估计量 · 泛函 · 情景 · INTERACT ·

2024 年 11 月 3 日

Sequential Resource Trading Using Comparison-Based Gradient Estimation

翻译：基于比较的梯度估计的顺序资源交易

Surya Murthy,Mustafa O. Karabag,Ufuk Topcu

Autonomous agents interact with other agents of unknown preferences to share resources in their environment. We explore sequential trading for resource allocation in a setting where two greedily rational agents sequentially trade resources from a finite set of categories. Each agent has a utility function that depends on the amount of resources it possesses in each category. The offering agent makes trade offers to improve its utility without knowing the responding agent's utility function, and the responding agent only accepts offers that improve its utility. We present an algorithm for the offering agent to estimate the responding agent's gradient (preferences) and make offers based on previous acceptance or rejection responses. The algorithm's goal is to reach a Pareto-optimal resource allocation state while ensuring that the utilities of both agents improve after every accepted trade. We show that, after a finite number of consecutively rejected offers, the responding agent is at a near-optimal state, or the agents' gradients are closely aligned. We compare the proposed algorithm against various baselines in continuous and discrete trading scenarios and show that it improves the societal benefit with fewer offers.

翻译：自主智能体与偏好未知的其他智能体交互，以共享其环境中的资源。我们研究了一种顺序交易机制，用于在有限资源类别集合中，两个贪婪理性智能体顺序交易资源的场景。每个智能体具有一个效用函数，该函数取决于其在每个类别中拥有的资源量。提出交易的智能体在不知道响应智能体效用函数的情况下提出交易提议以提升自身效用，而响应智能体仅接受能提升其效用的提议。我们提出了一种算法，使提议智能体能够估计响应智能体的梯度（偏好），并根据先前接受或拒绝的响应来制定提议。该算法的目标是达到帕累托最优的资源分配状态，同时确保每次被接受的交易后两个智能体的效用均得到提升。我们证明，在连续有限次被拒绝的提议后，响应智能体将处于接近最优的状态，或者两个智能体的梯度将高度对齐。我们在连续和离散交易场景中将所提算法与多种基线方法进行比较，结果表明该算法能以更少的提议次数提升社会福利。

0

相关内容

Agent

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

语义Web知识库补全关键技术研究

国家自然科学基金

18+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

3+阅读 · 2015年12月31日

太阳活动11年周期影响春季NAO与东亚夏季风关系的过程及机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

Neolaxiflorin B的全合成研究

国家自然科学基金

0+阅读 · 2014年12月31日

Exploring Vacant Classes in Label-Skewed Federated Learning

Arxiv

1+阅读 · 2024年12月16日

Semi-Supervised Community Detection via Quasi-Stationary Distributions

Arxiv

1+阅读 · 2024年12月13日

Obfuscated Activations Bypass LLM Latent-Space Defenses

Arxiv

1+阅读 · 2024年12月12日

Falcon-UI: Understanding GUI Before Following User Instructions

Arxiv

1+阅读 · 2024年12月12日

Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension

Arxiv

1+阅读 · 2024年12月12日

Continuous Gaussian Process Pre-Optimization for Asynchronous Event-Inertial Odometry

Arxiv

1+阅读 · 2024年12月12日

Extracting Database Access-control Policies From Web Applications

Arxiv

1+阅读 · 2024年12月11日

Deep Reinforcement Learning for Multi-Agent Interaction

Arxiv

46+阅读 · 2022年8月2日

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

Arxiv

19+阅读 · 2021年4月19日

Learning Latent Representations to Influence Multi-Agent Interaction

Arxiv

11+阅读 · 2020年11月12日

VIP会员

文章信息

相关主题

估计/估计量

最新内容

2025年大语言模型进展报告

2025年大语言模型进展报告

专知会员服务

1+阅读 · 今天13:30

多智能体协作机制

多智能体协作机制

专知会员服务

1+阅读 · 今天13:26

非对称优势：美海军开发低成本反无人机技术

非对称优势：美海军开发低成本反无人机技术

专知会员服务

4+阅读 · 今天4:39

《反无人机技术领域的技术发展综述：C-UAS探测、跟踪与识别技术》80页报告

《反无人机技术领域的技术发展综述：C-UAS探测、跟踪与识别技术》80页报告

专知会员服务

14+阅读 · 今天2:52

《美战争部小企业创新研究（SBIR）计划》

《美战争部小企业创新研究（SBIR）计划》

专知会员服务

6+阅读 · 今天2:48

《军事模拟：将军事条令与目标融入AI智能体》

《军事模拟：将军事条令与目标融入AI智能体》

专知会员服务

9+阅读 · 今天2:43

【NTU博士论文】3D人体动作生成

【NTU博士论文】3D人体动作生成

专知会员服务

7+阅读 · 4月24日

DeepSeek-V4：百万 Token 上下文背后，大模型正在进入“长程智能”时代（附中英文pdf版）

DeepSeek-V4：百万 Token 上下文背后，大模型正在进入“长程智能”时代（附中英文pdf版）

专知会员服务

8+阅读 · 4月24日

以色列军事技术对美国军力发展的持续性赋能

以色列军事技术对美国军力发展的持续性赋能

专知会员服务

8+阅读 · 4月24日

战场之外的较量：美伊冲突中的认知战与心理博弈

战场之外的较量：美伊冲突中的认知战与心理博弈

专知会员服务

6+阅读 · 4月24日

俄乌战争中乌克兰防空能力演变与见解（中文版）

俄乌战争中乌克兰防空能力演变与见解（中文版）

专知会员服务

7+阅读 · 4月24日

《面向巡飞弹药系统的情境感知深度强化学习自主非线性机动控制》

《面向巡飞弹药系统的情境感知深度强化学习自主非线性机动控制》

专知会员服务

10+阅读 · 4月24日

《深度强化学习在兵棋推演中的应用》40页报告

《深度强化学习在兵棋推演中的应用》40页报告

专知会员服务

14+阅读 · 4月24日

《多域作战面临复杂现实》

《多域作战面临复杂现实》

专知会员服务

10+阅读 · 4月24日

《印度的多域作战：条令与能力发展》报告

《印度的多域作战：条令与能力发展》报告

专知会员服务

5+阅读 · 4月24日

相关VIP内容

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

32+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

多智能体协作机制

《反无人机技术领域的技术发展综述：C-UAS探测、跟踪与识别技术》80页报告

2025年大语言模型进展报告

非对称优势：美海军开发低成本反无人机技术

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

相关论文

Exploring Vacant Classes in Label-Skewed Federated Learning

Arxiv

1+阅读 · 2024年12月16日

Semi-Supervised Community Detection via Quasi-Stationary Distributions

Arxiv

1+阅读 · 2024年12月13日

Obfuscated Activations Bypass LLM Latent-Space Defenses

Arxiv

1+阅读 · 2024年12月12日

Falcon-UI: Understanding GUI Before Following User Instructions

Arxiv

1+阅读 · 2024年12月12日

Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension

Arxiv

1+阅读 · 2024年12月12日

Continuous Gaussian Process Pre-Optimization for Asynchronous Event-Inertial Odometry

Arxiv

1+阅读 · 2024年12月12日

Extracting Database Access-control Policies From Web Applications

Arxiv

1+阅读 · 2024年12月11日

Deep Reinforcement Learning for Multi-Agent Interaction

Arxiv

46+阅读 · 2022年8月2日

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

Arxiv

19+阅读 · 2021年4月19日

Learning Latent Representations to Influence Multi-Agent Interaction

Arxiv

11+阅读 · 2020年11月12日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

语义Web知识库补全关键技术研究

国家自然科学基金

18+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

DMB信号水汽探测方法若干问题研究

国家自然科学基金

3+阅读 · 2015年12月31日

太阳活动11年周期影响春季NAO与东亚夏季风关系的过程及机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于决策模型和预备电位的运动想象BCI研究

国家自然科学基金

3+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

Neolaxiflorin B的全合成研究

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员