Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies - 专知论文

会员服务 ·

0

语言模型化 · MoDELS · GPT-3 · 路径 · 讲稿 ·

2023 年 2 月 14 日

Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies

翻译：使用大语言模型模拟多个个体并复现人类受试者研究

Gati Aher,Rosa I. Arriaga,Adam Tauman Kalai

from arxiv, Added Turing Experiment (TE) framing and Wisdom of Crowds TE

We introduce a new type of test, called a Turing Experiment (TE), for evaluating how well a language model, such as GPT-3, can simulate different aspects of human behavior. Unlike the Turing Test, which involves simulating a single arbitrary individual, a TE requires simulating a representative sample of participants in human subject research. We give TEs that attempt to replicate well-established findings in prior studies. We design a methodology for simulating TEs and illustrate its use to compare how well different language models are able to reproduce classic economic, psycholinguistic, and social psychology experiments: Ultimatum Game, Garden Path Sentences, Milgram Shock Experiment, and Wisdom of Crowds. In the first three TEs, the existing findings were replicated using recent models, while the last TE reveals a "hyper-accuracy distortion" present in some language models.

翻译：我们提出一种新型测试方法，称为图灵实验（Turing Experiment, TE），用于评估语言模型（如GPT-3）模拟人类行为不同方面的能力。与旨在模拟单个任意个体的图灵测试不同，TE要求模拟人类受试者研究中具有代表性的参与者样本。我们设计了能够复现先前研究中已得到充分验证结论的图灵实验，并构建了一套实施TE的方法论。通过该方法，我们比较了不同语言模型在复现经典经济学、心理语言学及社会心理学实验（包括：最后通牒博弈、花园路径句、米尔格拉姆电击实验、群体智慧）中的表现。前三项TE中，现有结论可通过最新模型复现，而最后一项TE揭示了部分语言模型中存在的“超精准失真”现象。

0

相关内容

语言模型化

语言模型化

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

专知会员服务

18+阅读 · 2019年12月14日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

靶向控释siRNA调控S6K1选择性剪接逆转非小细胞肺癌放疗抵抗的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

δ-catenin对非小细胞肺癌凋亡的影响及其机制初探

国家自然科学基金

0+阅读 · 2012年12月31日

Plk2与ArgBP2相互作用在骨肉瘤细胞迁移中的作用及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

中国碎米蕨类（cheilanthoid ferns）的系统学研究

国家自然科学基金

0+阅读 · 2012年12月31日

血清-巨噬细胞间胆固醇流动与动脉粥样硬化关系研究

国家自然科学基金

0+阅读 · 2011年12月31日

Puma和Bim在慢性淋巴细胞白血病细胞凋亡中的作用机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

IER5基因调节宫颈癌放疗敏感性的功能及其作用机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

Pim激酶家族参与Abl介导细胞转化的分子机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

Behavioral estimates of conceptual structure are robust across tasks in humans but not large language models

Arxiv

0+阅读 · 2023年4月5日

PromptAid: Prompt Exploration, Perturbation, Testing and Iteration using Visual Analytics for Large Language Models

Arxiv

0+阅读 · 2023年4月4日

Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

Arxiv

0+阅读 · 2023年4月4日

Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT

Arxiv

3+阅读 · 2023年4月3日

Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study

Arxiv

0+阅读 · 2023年4月3日

Zero-shot meta-learning for small-scale data from human subjects

Arxiv

0+阅读 · 2023年4月1日

Large language models can rate news outlet credibility

Arxiv

0+阅读 · 2023年4月1日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

VIP会员

文章信息

相关主题

语言模型化

最新内容

《防空系统对自主武器系统辩论中“有意义的人类控制”的启示》70页报告

《防空系统对自主武器系统辩论中“有意义的人类控制”的启示》70页报告

专知会员服务

0+阅读 · 5分钟前

“对标ChatGPT”：乌军研发Marichka AI系统用于战场筹划

“对标ChatGPT”：乌军研发Marichka AI系统用于战场筹划

专知会员服务

1+阅读 · 9分钟前

《同步多无人机系统中的故障与通信》

《同步多无人机系统中的故障与通信》

专知会员服务

1+阅读 · 今天6:23

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

专知会员服务

2+阅读 · 7月28日

博士论文 | 从算法到基础模型：强化学习的统一视角

博士论文 | 从算法到基础模型：强化学习的统一视角

专知会员服务

6+阅读 · 7月28日

面向国防作战的最佳自主与蜂群无人机技术

面向国防作战的最佳自主与蜂群无人机技术

专知会员服务

7+阅读 · 7月28日

《异构人类团队的协作决策过程混合建模研究》

《异构人类团队的协作决策过程混合建模研究》

专知会员服务

7+阅读 · 7月28日

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

《C5ISR系统中的注意力动态与自适应决策支持研究：视觉与多模态注意力引导对任务绩效影响的递归量化分析》最新36页报告

专知会员服务

8+阅读 · 7月28日

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

《设计思维中的人机协作：生成式人工智能对共情访谈影响的探究》140页

专知会员服务

9+阅读 · 7月28日

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

5+阅读 · 7月27日

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

9+阅读 · 7月27日

《无人系统互操作性导论——无人系统联合架构（JAUS）》

《无人系统互操作性导论——无人系统联合架构（JAUS）》

专知会员服务

14+阅读 · 7月27日

美空军新型反无人机部队初探

美空军新型反无人机部队初探

专知会员服务

9+阅读 · 7月27日

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

专知会员服务

8+阅读 · 7月27日

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

专知会员服务

7+阅读 · 7月27日

相关VIP内容

2020数据工程师成长路线图

专知会员服务

41+阅读 · 2020年9月6日

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

【论文推荐】将机器语言模型扩展到人类级别的语言理解，Extending Machine Language Models toward Human-Level Language Understanding

专知会员服务

18+阅读 · 2019年12月14日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

106+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《同步多无人机系统中的故障与通信》

博士论文 | 从算法到基础模型：强化学习的统一视角

“对标ChatGPT”：乌军研发Marichka AI系统用于战场筹划

论文解读 | 医学图像修复中的扩散模型：挑战、分类与未来方向

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】用Python/OpenCV实现增强现实

【推荐】用Python/OpenCV实现增强现实

机器学习研究会

15+阅读 · 2017年11月16日

相关论文

Behavioral estimates of conceptual structure are robust across tasks in humans but not large language models

Arxiv

0+阅读 · 2023年4月5日

PromptAid: Prompt Exploration, Perturbation, Testing and Iteration using Visual Analytics for Large Language Models

Arxiv

0+阅读 · 2023年4月4日

Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

Arxiv

0+阅读 · 2023年4月4日

Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT

Arxiv

3+阅读 · 2023年4月3日

Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study

Arxiv

0+阅读 · 2023年4月3日

Zero-shot meta-learning for small-scale data from human subjects

Arxiv

0+阅读 · 2023年4月1日

Large language models can rate news outlet credibility

Arxiv

0+阅读 · 2023年4月1日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

相关基金

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

靶向控释siRNA调控S6K1选择性剪接逆转非小细胞肺癌放疗抵抗的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

δ-catenin对非小细胞肺癌凋亡的影响及其机制初探

国家自然科学基金

0+阅读 · 2012年12月31日

Plk2与ArgBP2相互作用在骨肉瘤细胞迁移中的作用及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

中国碎米蕨类（cheilanthoid ferns）的系统学研究

国家自然科学基金

0+阅读 · 2012年12月31日

血清-巨噬细胞间胆固醇流动与动脉粥样硬化关系研究

国家自然科学基金

0+阅读 · 2011年12月31日

Puma和Bim在慢性淋巴细胞白血病细胞凋亡中的作用机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

IER5基因调节宫颈癌放疗敏感性的功能及其作用机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

Pim激酶家族参与Abl介导细胞转化的分子机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员