This paper investigates the ability of Large Language Models (LLMs), specifically GPT-3.5-turbo (GPT), to form inflation perceptions and expectations based on macroeconomic price signals. We compare the LLM's output to household survey data and official statistics, mimicking the information set and demographic characteristics of the Bank of England's Inflation Attitudes Survey (IAS). Our quasi-experimental design exploits the timing of GPT's training cut-off in September 2021, which means the model has no knowledge of the subsequent UK inflation surge. We find that GPT tracks aggregate survey projections and official statistics at short horizons. At a disaggregated level, GPT replicates key empirical regularities of households' inflation perceptions, particularly for income, housing tenure, and social class. A novel Shapley value decomposition of LLM outputs, suited to the synthetic survey setting, provides well-defined insights into how prompt content drives model outputs. We find that GPT demonstrates a heightened sensitivity to food inflation information, similar to that of human respondents. However, we also find that it lacks a consistent model of consumer price inflation. More generally, our approach could be used to evaluate the behaviour of LLMs for use in the social sciences, to compare different models, or to assist in survey design.
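To illustrate the general idea behind a Shapley value decomposition over prompt content, the following is a minimal sketch, not the paper's actual procedure. It computes exact Shapley values for a small set of prompt components by averaging each component's marginal contribution over all subsets of the others. The component names (`food`, `energy`, `housing`), the function `toy_response`, and all numbers are hypothetical stand-ins: in a real synthetic survey, the value function would instead query the LLM with a prompt containing only the included components and return its stated inflation figure.

```python
from itertools import combinations
from math import factorial


def shapley_values(components, value_fn):
    """Exact Shapley values for a list of prompt components.

    value_fn takes a frozenset of included components and returns a number
    (for example, the inflation rate reported by the model for that prompt).
    """
    n = len(components)
    phi = {c: 0.0 for c in components}
    for c in components:
        others = [o for o in components if o != c]
        for k in range(n):  # subset sizes 0 .. n-1
            # Standard Shapley weight for a coalition of size k
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of c when added to coalition s
                phi[c] += weight * (value_fn(s | {c}) - value_fn(s))
    return phi


def toy_response(included):
    """Hypothetical stand-in for an LLM call; all numbers are invented."""
    effects = {"food": 1.5, "energy": 0.8, "housing": 0.4}
    out = 2.0 + sum(effects[c] for c in included)
    # Invented interaction term, shared equally between the two components
    if "food" in included and "energy" in included:
        out += 0.6
    return out


components = ["food", "energy", "housing"]
phi = shapley_values(components, toy_response)
total = toy_response(frozenset(components)) - toy_response(frozenset())
```

By construction, the attributions in `phi` sum to `total`, the difference between the response to the full prompt and the response to the empty prompt. Exact enumeration requires 2^n evaluations of the value function, so with a real model it is only practical for a small number of prompt components; for larger sets, sampling-based approximations are a common substitute.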