Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models

As larger, newer Large Language Models continue to improve on strategic Theory of Mind (ToM) tasks, demand for these state-of-the-art models increases commensurately. However, deploying them is costly in both processing power and time. In this paper, we investigate the feasibility of creating smaller, simulation-ready agents by way of fine-tuning. To do this, we present a large pre-trained model with 20 unique scenarios that combine a social context with a social dilemma, record its answers, and use them for Q&A fine-tuning of a smaller model of the same family. Our focus is on in-context game-theoretic decision-making, the same domain within which human interaction occurs and one that requires both a theory of mind (or a semblance thereof) and an understanding of social dynamics. We find that the fine-tuned smaller language model performed significantly closer to its larger relative, and that its improvements extended to areas and contexts beyond the ones provided in the training examples. Averaged over all games, fine-tuning yielded a 46% improvement in the smaller model's alignment with the behavior of the larger model, with 100% representing complete alignment. This suggests that our pipeline represents an efficient method to transmit some form of theory of mind to smaller models, creating improved and cheaply deployable algorithms in the process. Despite their simplicity and their associated shortcomings and limitations, our findings represent a stepping stone in the pursuit and training of specialized models for strategic and social decision making.
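The pipeline described in the abstract (query the large model on each scenario, record its answers, reuse them as Q&A fine-tuning data, then measure how much closer the small model gets) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the record fields `question` and `answer`, the `teacher` callable, and the function names are all hypothetical. In particular, `gap_closed` encodes one possible reading of the alignment figure, namely the share of the behavioral gap between the small and large model that fine-tuning removes. The abstract does not define the metric, so that formula is an assumption.

```python
import json
from typing import Callable


def build_qa_records(scenarios: list[str], teacher: Callable[[str], str]) -> list[dict]:
    """Query the teacher once per scenario and keep (question, answer) pairs.

    `teacher` is a hypothetical stand-in for a call to the large model.
    """
    return [{"question": s, "answer": teacher(s)} for s in scenarios]


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, a common input format for fine-tuning."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def gap_closed(teacher_rate: float, base_rate: float, tuned_rate: float) -> float:
    """Share of the base-to-teacher behavioral gap removed by fine-tuning.

    Assumed metric: 1.0 means the tuned model matches the teacher exactly,
    0.0 means no change relative to the base model.
    """
    gap_before = abs(teacher_rate - base_rate)
    gap_after = abs(teacher_rate - tuned_rate)
    if gap_before == 0:
        raise ValueError("base model already matches the teacher; no gap to close")
    return 1.0 - gap_after / gap_before


if __name__ == "__main__":
    # Toy teacher that always cooperates, used only to exercise the functions.
    demo = build_qa_records(
        ["Two colleagues must decide whether to share credit."],
        teacher=lambda prompt: "cooperate",
    )
    print(to_jsonl(demo))
    # Made-up cooperation rates for teacher, base, and fine-tuned models.
    print(gap_closed(teacher_rate=0.75, base_rate=0.25, tuned_rate=0.5))
```

The rates passed to `gap_closed` could be any per-game behavioral summary, such as a cooperation frequency. Averaging the result over games would give a single figure comparable in spirit to the reported one, under the assumed definition.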

