May 23, 2023

Human Choice Prediction in Language-based Non-Cooperative Games: Simulation-based Off-Policy Evaluation

Eilam Shapira, Reut Apel, Moshe Tennenholtz, Roi Reichart

Persuasion games have been fundamental in economics and AI research and have significant practical applications. Recent work in this area has started to incorporate natural language, moving beyond the traditional stylized message setting. However, previous research has focused on on-policy prediction, where the training and test data share the same distribution, which is not representative of real-life scenarios. In this paper, we tackle the challenging problem of off-policy evaluation (OPE) in language-based persuasion games. To address the inherent difficulty of collecting human data in this setup, we propose a novel approach that combines real and simulated human-bot interaction data. Our simulated data is created by an exogenous model which assumes that decision makers (DMs) start with a mixture of random and decision-theoretic behaviors and improve over time. We present a deep learning training algorithm that effectively integrates real interaction data and simulated data, substantially improving over models trained only on interaction data. Our results demonstrate the potential of mixing real interaction and simulation data as a cost-effective and scalable solution for OPE in language-based persuasion games. Our code and the large dataset we collected and generated are submitted as supplementary material and will be made publicly available upon acceptance.
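The simulation idea in the abstract can be illustrated with a small sketch. This is not the authors' model or training algorithm: the class `SimulatedDM`, the functions `simulate` and `mix`, the geometric decay schedule, the uniform stand-in for a message's expected value, and the `sim_ratio` parameter are all assumptions made here for illustration. The sketch only shows a decision maker that mixes random and expected-value-based choices, acts randomly less often as rounds progress, and whose simulated rows are then combined with real interaction rows.

```python
import random
from dataclasses import dataclass


@dataclass
class SimulatedDM:
    """Hypothetical decision maker; an illustration, not the paper's model."""
    p_random_start: float = 0.5  # assumed initial share of random choices
    decay: float = 0.9           # assumed per-round shrink of that share

    def p_random(self, round_idx: int) -> float:
        # The probability of acting randomly shrinks geometrically,
        # which stands in for "improving over time".
        return self.p_random_start * self.decay ** round_idx

    def choose(self, expected_value: float, round_idx: int,
               rng: random.Random) -> int:
        """Return 1 (accept the offer) or 0 (reject it)."""
        if rng.random() < self.p_random(round_idx):
            return rng.randint(0, 1)       # random behavior
        return int(expected_value > 0.0)   # decision-theoretic behavior


def simulate(n_games: int, n_rounds: int, seed: int = 0) -> list[dict]:
    """Generate simulated interaction rows for n_games games."""
    rng = random.Random(seed)
    dm = SimulatedDM()
    rows = []
    for game in range(n_games):
        for t in range(n_rounds):
            # Assumed stand-in for the value a text message implies.
            ev = rng.uniform(-1.0, 1.0)
            rows.append({"game": game, "round": t, "ev": ev,
                         "choice": dm.choose(ev, t, rng), "source": "sim"})
    return rows


def mix(real: list[dict], simulated: list[dict], sim_ratio: float,
        seed: int = 0) -> list[dict]:
    """Return all real rows plus about sim_ratio * len(real) simulated rows."""
    rng = random.Random(seed)
    k = min(len(simulated), int(sim_ratio * len(real)))
    batch = real + rng.sample(simulated, k)
    rng.shuffle(batch)
    return batch
```

For example, `mix(real_rows, simulate(100, 10), sim_ratio=2.0)` would return a shuffled training set with up to two simulated rows per real row. How the real and simulated data are actually weighted during training is not specified in the abstract, so the ratio here is purely a placeholder.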
