We study how experience with asset price bubbles changes the trading strategies of reinforcement learning (RL) traders, and ask whether that change helps prevent future bubbles. We train RL traders in ABIDES, a multi-agent market simulation platform, and compare the strategies of traders trained with and without bubble experience. We find that RL traders without bubble experience behave like short-term momentum traders, whereas traders with bubble experience behave like value traders. As a result, RL traders without bubble experience amplify bubbles, whereas RL traders with bubble experience tend to suppress and sometimes prevent them. This finding suggests that learning from experience is a mechanism behind boom-and-bust cycles: the experience of a collapsing bubble makes future bubbles less likely for a period of time, until the memory fades and bubbles become more likely to form again.