This paper shows how reinforcement learning can explain two puzzling empirical patterns in household consumption during economic downturns. I develop a model in which agents make consumption-savings decisions under income uncertainty using Q-learning with neural network function approximation, departing from the standard rational expectations assumption. The model replicates two key findings from the recent literature: (1) unemployed households with low prior liquid assets exhibit substantially higher marginal propensities to consume (MPCs) out of stimulus transfers than high-asset households (0.50 vs. 0.34), even when neither group faces borrowing constraints, consistent with Ganong et al. (2024); and (2) households with more past unemployment experience maintain persistently lower consumption after controlling for current economic conditions, a "scarring" effect documented by Malmendier and Shen (2024). Unlike existing explanations based on belief updating about income risk or ex-ante heterogeneity, the reinforcement learning mechanism generates higher MPCs and lower consumption levels simultaneously, through value function approximation errors that evolve with experience. Simulation results closely match the empirical estimates, suggesting that reinforcement learning provides a unifying framework for understanding how past experiences shape current consumption beyond what current economic conditions would predict.