Language models have shown unprecedented capabilities, sparking debate over the source of their performance. Is it merely the outcome of learning syntactic patterns and surface-level statistics, or do they extract semantics and a world model from the text? Prior work by Li et al. investigated this question by training a GPT model on synthetic, randomly generated Othello games and found that the model learned an internal representation of the board state. We extend this work to the more complex domain of chess, training on real games and investigating our model's internal representations using linear probes and contrastive activations. The model is given no a priori knowledge of the game and is trained solely on next-character prediction, yet we find evidence of internal representations of board state. We validate these representations by using them to intervene on the model's activations and edit its internal board state. Unlike Li et al.'s prior synthetic-dataset approach, our analysis finds that the model also learns to estimate latent variables such as player skill to better predict the next character. We derive a player-skill vector and add it to the model's activations, improving the model's win rate by up to 2.6 times.
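The linear-probing methodology mentioned in the abstract can be sketched minimally. This is a hedged illustration, not the paper's implementation: the activations here are synthetic stand-ins with a planted linear signal, whereas in the real experiments each row would be a residual-stream activation from the trained chess GPT at a given move, and `square_state` the true contents of one board square (0 = empty, 1 = white piece, 2 = black piece). The dimensions and encoding are assumptions for illustration only.

```python
import numpy as np

# Hypothetical stand-in data: n examples of d-dimensional "activations"
# and the state of a single board square for each example.
rng = np.random.default_rng(0)
n, d = 600, 64
square_state = rng.integers(0, 3, size=n)
# Plant a linear signal so the synthetic probe has something to recover;
# a real probe would instead look for structure the model learned itself.
directions = rng.normal(size=(3, d))
activations = directions[square_state] + 0.5 * rng.normal(size=(n, d))

# A linear probe is a single linear map from frozen activations to the
# target; here it is fit by least squares against one-hot targets.
targets = np.eye(3)[square_state]
W, *_ = np.linalg.lstsq(activations[:500], targets[:500], rcond=None)

# Accuracy on held-out activations; ~0.33 would be chance for 3 classes.
preds = (activations[500:] @ W).argmax(axis=1)
accuracy = (preds == square_state[500:]).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

Probe accuracy well above chance on held-out data is the kind of evidence used to argue that board state is linearly represented in the activations; the paper's intervention experiments then edit those representations directly.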