Modern AI agents increasingly combine conversational interaction with autonomous task execution, such as coding and web research, raising a natural question: what happens when an agent engaged in long-horizon tasks is exposed to user persuasion? Studying this question is challenging because long-running agent behavior is noisy and costly to reproduce, and it remains unclear which challenges emerge only in extended task execution. We study how belief-level interventions can influence downstream task behavior, a phenomenon we name persuasion propagation. We introduce a behavior-centered evaluation framework that distinguishes between persuasion applied during task execution and persuasion applied prior to it. Across web research and coding tasks, we find that on-the-fly persuasion induces weak and inconsistent behavioral effects. In contrast, when the belief state is explicitly specified at task time, belief-prefilled agents conduct on average 26.9% fewer searches and visit 16.9% fewer unique sources than neutral-prefilled agents. These results suggest that persuasion, even when it occurs in a prior interaction, can affect the agent's behavior, motivating behavior-level evaluation in agentic systems.