As the performance of larger, newer Large Language Models continues to improve on strategic Theory of Mind (ToM) tasks, demand for these state-of-the-art models increases commensurately. However, their deployment is costly in terms of both processing power and time. In this paper, we investigate the feasibility of creating smaller, high-performing specialized models through fine-tuning. To do this, we first present a large pre-trained model with 20 unique scenarios that combine different social contexts with games posing a variety of social dilemmas, record its answers, and use them for Q&A fine-tuning of a smaller model from the same family. Our focus is on in-context game-theoretic decision-making, the same domain within which human interaction occurs and one that requires both a theory of mind (or a semblance thereof) and an understanding of social dynamics. The smaller model is therefore trained not just on the answers provided, but also on the motivations given by the larger model, which should contain advice and guidelines for navigating both strategic dilemmas and social cues. We find that the fine-tuned smaller language model consistently bridges the gap in performance between its smaller pre-trained version and its larger relative, and that its improvements extend to areas and contexts beyond those covered in the training examples, including out-of-sample scenarios with entirely different game structures. Averaged across all games, fine-tuning yields a 46% improvement in the smaller model, measured as alignment with the behavior of the larger model, where 100% represents indistinguishable behavior. When presented with out-of-sample social contexts and games, the fine-tuned model still displays substantial alignment, with improvements of 18% and 28%, respectively.
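As a minimal sketch of the data-collection step described above (prompting the teacher on each scenario and keeping both its chosen action and its motivation as a Q&A fine-tuning example), the snippet below uses the Hugging Face `transformers` API; the model names, scenario file, and prompt wording are hypothetical placeholders, not the paper's exact setup:

```python
# Sketch: build a Q&A fine-tuning set from a larger "teacher" model's
# answers and motivations on social-dilemma game scenarios.
# Model names and file paths below are illustrative placeholders.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "large-model-placeholder"  # hypothetical: the larger pre-trained model

tok = AutoTokenizer.from_pretrained(TEACHER)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER)

def record_teacher_response(scenario: str) -> dict:
    """Prompt the teacher with one social-context + game scenario and keep
    both its chosen action and the motivation it gives for that choice."""
    prompt = f"{scenario}\nWhich action do you choose, and why?"
    inputs = tok(prompt, return_tensors="pt")
    output = teacher.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    answer = tok.decode(output[0][inputs["input_ids"].shape[1]:],
                        skip_special_tokens=True)
    # One supervised fine-tuning example for the smaller student model.
    return {"prompt": prompt, "completion": answer}

# Collect teacher responses for the 20 scenarios (file name is hypothetical)
# and write them in a JSONL format typical for Q&A fine-tuning.
with open("scenarios.json") as f:
    scenarios = json.load(f)
with open("finetune_data.jsonl", "w") as f:
    for s in scenarios:
        f.write(json.dumps(record_teacher_response(s)) + "\n")
```

Because each recorded completion contains the teacher's motivation alongside its chosen action, the student is fine-tuned on the reasoning as well as the answer, which is the mechanism the abstract credits for generalization beyond the training scenarios.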