Recent work increasingly applies Large Language Models (LLMs) as agents in stock market simulations to test whether micro-level behaviors aggregate into macro-level phenomena. A crucial question follows: do LLM agents behave like real market participants? The validity of simulation results depends on this alignment. To explore it, we use a stock market scenario to test behavioral consistency. Investors are typically classified as fundamental or technical traders, but most simulations fix each agent's strategy at initialization and therefore fail to reflect real-world trading dynamics, in which investors switch between strategies. In this work, we assess whether agents' strategy switching aligns with financial theory and provide a framework for this evaluation. We operationalize four behavioral-finance drivers (loss aversion, herding, wealth differentiation, and price misalignment) as personality traits that are set via prompting and stored in long-term memory. In year-long simulations, agents process daily price-volume data, trade under a designated style, and reassess their strategy every 10 trading days. We introduce four alignment metrics and use Mann-Whitney U tests to compare agents' style-switching behavior with the predictions of financial theory. Our results show that the switching behavior of recent LLMs is only partially consistent with behavioral-finance theories, indicating that agent behavior needs further refinement before it matches those theories.
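As an illustration of the kind of comparison the abstract describes, the following minimal sketch computes a per-agent switching rate and applies a one-sided Mann-Whitney U test to two groups of agents. It is not the paper's implementation: the abstract does not define the four alignment metrics, so `switch_rate` is a hypothetical stand-in, the style labels `"F"` (fundamental) and `"T"` (technical) are assumed, the example sequences are invented, and the p-value uses a normal approximation with tie correction rather than an exact test.

```python
import math


def switch_rate(styles):
    """Fraction of reassessment points at which an agent changed style.

    `styles` is the sequence of styles held after each reassessment
    (hypothetical labels: "F" = fundamental, "T" = technical).
    """
    changes = sum(a != b for a, b in zip(styles, styles[1:]))
    return changes / (len(styles) - 1)


def mann_whitney_u(x, y):
    """One-sided Mann-Whitney U test (H1: x tends to be larger than y).

    Returns (U statistic for x, approximate p-value). Ties receive
    midranks; the p-value comes from a normal approximation with tie
    correction and no continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t**3 - t
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # every observation is identical
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    return u, 0.5 * math.erfc(z / math.sqrt(2))


# Invented style sequences for two trait groups (illustration only).
high_trait = [list("FTFTFT"), list("FTTFTF"), list("TFTFFT")]
low_trait = [list("FFFFFT"), list("FFFFFF"), list("TTTTFF")]

u_stat, p_value = mann_whitney_u(
    [switch_rate(s) for s in high_trait],
    [switch_rate(s) for s in low_trait],
)
print(f"U = {u_stat}, one-sided p = {p_value:.4f}")
```

A rank-based test suits this setting because switching rates are bounded and unlikely to be normally distributed. In practice a library routine such as `scipy.stats.mannwhitneyu` would be the more usual choice; the hand-written version here only keeps the sketch free of dependencies.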