Conversational AI systems are increasingly used for personal reflection and emotional disclosure, raising concerns about their effects on vulnerable users. Recent anecdotal reports suggest that prolonged interactions with AI may reinforce delusional thinking -- a phenomenon sometimes described as AI Psychosis. However, empirical evidence on this phenomenon remains limited. In this work, we examine how delusion-related language evolves during multi-turn interactions with conversational AI. We construct simulated users (SimUsers) from Reddit users' longitudinal posting histories and generate extended conversations with three model families (GPT, LLaMA, and Qwen). We develop DelusionScore, a linguistic measure that quantifies the intensity of delusion-related language across conversational turns. We find that SimUsers derived from users with prior delusion-related discourse (Treatment) exhibit progressively increasing DelusionScore trajectories, whereas those derived from users without such discourse (Control) remain stable or decline. We further find that this amplification varies across themes, with reality skepticism and compulsive reasoning showing the strongest increases. Finally, conditioning AI responses on current DelusionScore substantially reduces these trajectories. These findings provide empirical evidence that conversational AI interactions can amplify delusion-related language over extended use and highlight the importance of state-aware safety mechanisms for mitigating such risks.
翻译:对话式 AI 系统越来越多地用于个人反思和情感披露,引发了对易受伤害用户影响的担忧。近期的传闻报告表明,与 AI 的长时间互动可能会强化妄想性思维——这种现象有时被称为“AI 精神病”。然而,关于这一现象的经验证据仍然有限。在这项工作中,我们研究了在与对话式 AI 的多轮互动中,与妄想相关的语言是如何演变的。我们利用 Reddit 用户的纵向发帖历史构建模拟用户(SimUsers),并生成与三个模型系列(GPT、LLaMA 和 Qwen)的扩展对话。我们开发了 DelusionScore,这是一种语言度量,用于量化跨对话轮次中与妄想相关语言的强度。我们发现,来自先前有妄想相关话语(处理组)Reddit 用户的 SimUsers 表现出逐渐增加的 DelusionScore 轨迹,而来自没有此类话语(对照组)的用户的 SimUsers 则保持稳定或下降。我们进一步发现,这种放大在各主题间有所不同,其中现实怀疑和强迫性推理显示出最强的增长。最后,根据当前 DelusionScore 对 AI 响应进行条件限制,显著减少了这些轨迹。这些发现提供了经验证据,表明对话式 AI 互动可以在长时间使用中放大与妄想相关的语言,并强调了状态感知安全机制对于缓解此类风险的重要性。