Enhancing behavioral nudges with large language model-based iterative personalization: A field experiment on electricity and hot-water conservation

Nudging is widely used to promote behavioral change, but its effectiveness is often limited when recipients must repeatedly translate feedback into workable next steps under changing circumstances. Large language models (LLMs) may help reduce part of this cognitive work by generating personalized guidance and updating it iteratively across intervention rounds. We developed an LLM agent for iterative personalization and tested it in a three-arm randomized experiment among 233 university residents in China, using daily electricity and shower hot-water conservation as objectively measured cases differing in friction. LLM-personalized nudges (T2) produced the largest conservation effects, while image-enhanced conventional nudges (T1) and text-based conventional nudges (C) showed similar outcomes (omnibus p = 0.009). Relative to C, T2 reduced electricity consumption by 0.56 kWh per room-day (p = 0.014), corresponding to an 18.3 percentage-point higher adjusted saving rate. This advantage emerged within the first two intervention rounds, alongside iterative updating of personalized guidance, and persisted thereafter. Hot-water outcomes followed the same direction but were smaller, less precisely estimated, and attenuated over time, consistent with stronger friction in this domain. LLM-personalized nudges emphasized prospective and context-specific guidance and were associated with higher participant engagement. This study provides field evidence that LLM-based iterative personalization can enhance behavioral nudging, with behavioral friction as a potential boundary condition. Larger trials and extension to more behaviors are warranted.
