A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments

Environments built for people are increasingly operated by a new class of economic actors: LLM-powered software agents making decisions on our behalf. These decisions range from our purchases to travel plans to medical treatment selection. Current evaluations of these agents largely focus on task competence, but we argue for a deeper assessment: how these agents choose when faced with realistic decisions. We introduce ABxLab, a framework for systematically probing agentic choice through controlled manipulations of option attributes and persuasive cues. We apply this to a realistic web-based shopping environment, where we vary prices, ratings, and psychological nudges, all of which are factors long known to shape human choice. We find that agent decisions shift predictably and substantially in response, revealing that agents are strongly biased choosers even without being subject to the cognitive constraints that shape human biases. This susceptibility reveals both risk and opportunity: risk, because agentic consumers may inherit and amplify human biases; opportunity, because consumer choice provides a powerful testbed for a behavioral science of AI agents, just as it has for the study of human behavior. We release our framework as an open benchmark for rigorous, scalable evaluation of agent decision-making.
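The kind of controlled manipulation the abstract describes can be illustrated with a toy paired experiment. This is only a minimal sketch under stated assumptions, not ABxLab's actual interface: `Option`, `toy_agent`, and `nudge_effect` are hypothetical names, and the scoring rule is a made-up stand-in for an LLM agent. Each trial shows the same pair of products twice, once unchanged and once with a nudge attached to a randomly chosen target, and the estimated effect is the change in how often the target is picked.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Option:
    name: str
    price: float
    rating: float
    nudge: str = ""  # e.g. a "Best seller" badge; empty string means no nudge


def toy_agent(first: Option, second: Option) -> Option:
    """Hypothetical stand-in for an LLM agent (assumed scoring rule, not from the paper)."""
    def score(o: Option) -> float:
        return o.rating - 0.1 * o.price + (0.5 if o.nudge else 0.0)
    # Ties go to the option shown first, which mimics a position bias.
    return first if score(first) >= score(second) else second


def nudge_effect(choose, a: Option, b: Option, nudge: str,
                 trials: int = 1000, seed: int = 0) -> float:
    """Paired estimate: P(target chosen | nudged) - P(target chosen | not nudged)."""
    rng = random.Random(seed)
    control_hits = treated_hits = 0
    for _ in range(trials):
        # Randomize which option is the target and where it is displayed,
        # so the estimate is not confounded by identity or position.
        target, other = (a, b) if rng.random() < 0.5 else (b, a)
        target_first = rng.random() < 0.5
        treated = replace(target, nudge=nudge)

        def show(t: Option) -> Option:
            return choose(t, other) if target_first else choose(other, t)

        control_hits += show(target).name == target.name
        treated_hits += show(treated).name == target.name
    return (treated_hits - control_hits) / trials


if __name__ == "__main__":
    a = Option("A", price=20.0, rating=4.5)
    b = Option("B", price=20.0, rating=4.5)
    print(nudge_effect(toy_agent, a, b, nudge="Best seller"))
```

Holding the display order fixed across the two arms of each trial means that an agent which ignores the nudge yields an estimated effect of exactly zero, so any nonzero value is attributable to the manipulation itself. The same pattern would apply to price or rating changes by swapping the field passed to `replace`.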

