Can Base ChatGPT be Used for Forecasting without Additional Optimization?

This study investigates whether OpenAI's ChatGPT-3.5 and ChatGPT-4 can forecast future events. To evaluate the accuracy of their predictions, we exploit the fact that, at the time of our experiments (mid-2023), the models' training data ended in September 2021, and we ask about events that occurred in 2022. We employed two prompting strategies: direct prediction and what we call "future narratives," which ask ChatGPT to tell fictional stories set in the future in which characters recount events that lie in their past but occurred after ChatGPT's training data was collected. We prompted ChatGPT to engage in storytelling, particularly within economic contexts. Analyzing 100 trials, we find that future narrative prompts significantly enhanced ChatGPT-4's forecasting accuracy. This was especially evident in its predictions of major Academy Award winners and of economic trends, the latter inferred from scenarios in which the model impersonated public figures such as Federal Reserve Chair Jerome Powell. As a falsification exercise, we repeated our experiments in May 2024, by which time the models included more recent training data. ChatGPT-4's accuracy improved significantly when the training window included the events being asked about, reaching 100% in many instances. The poorer accuracy for events outside the training window suggests that, in the 2023 prediction experiments, ChatGPT-4 was forming predictions based solely on its training data. Narrative prompting also consistently outperformed direct prompting. These findings indicate that narrative prompts leverage the models' capacity for hallucinatory narrative construction, facilitating more effective data synthesis and extrapolation than straightforward predictions. Our research reveals new aspects of LLMs' predictive capabilities and suggests potential future applications in analytical contexts.
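To make the contrast between the two prompting strategies concrete, the following is a minimal Python sketch of how a direct prompt, a future-narrative prompt, and a per-strategy accuracy tally might be set up. It is an illustration only: the function names (`direct_prompt`, `narrative_prompt`, `accuracy`), the prompt wording, and the substring-based scoring rule are all assumptions and are not taken from the study, and the sketch makes no model calls.

```python
def direct_prompt(question: str) -> str:
    """Hypothetical direct-prediction prompt: ask for the outcome outright."""
    return f"{question} Answer with a single name or figure."


def narrative_prompt(event: str, narrator: str, setting: str) -> str:
    """Hypothetical future-narrative prompt.

    The story is set after the event, so the narrator describes the
    outcome as something that has already happened.
    """
    return (
        f"Write a short scene set in {setting}. In the scene, {narrator} "
        f"looks back and tells the audience about {event}, "
        f"stating the specific outcome."
    )


def accuracy(responses: list[str], truth: str) -> float:
    """Fraction of responses that mention the true outcome.

    Assumption: a response counts as correct if it contains the true
    answer as a case-insensitive substring. Real scoring would likely
    need manual review or a stricter matching rule.
    """
    if not responses:
        return 0.0
    hits = sum(truth.lower() in r.lower() for r in responses)
    return hits / len(responses)
```

In use, each prompt would be sent to the model repeatedly (the abstract reports 100 trials), the responses collected per strategy, and `accuracy` computed for each list so that the direct and narrative strategies can be compared on the same event.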
