How do language models "think"? This paper formulates a probabilistic cognitive model called the bounded pragmatic speaker, which can characterize the operation of different variants of language models. In particular, we show that large language models fine-tuned with reinforcement learning from human feedback (Ouyang et al., 2022) implement a model of thought that conceptually resembles a fast-and-slow model (Kahneman, 2011). We discuss the limitations of reinforcement learning from human feedback as a fast-and-slow model of thought and propose directions for extending this framework. Overall, our work demonstrates that viewing language models through the lens of probabilistic cognitive modeling can offer valuable insights for understanding, evaluating, and developing them.