User models in information retrieval rest on a foundational assumption: observed behavior reveals intent. This assumption collapses when the user is an AI agent privately configured by a human operator. For any action an agent takes, a hidden instruction could have produced identical output, making intent non-identifiable at the individual level. This is not a detection problem awaiting better tools; it is a structural property of any system in which humans configure agents behind closed doors. We investigate the agent-user problem through a large-scale corpus from an agent-native social platform: 370K posts from 47K agents across 4K communities. Our findings are threefold: (1) individual agent actions cannot be classified as autonomous or operator-directed from observables; (2) population-level platform signals still separate agents into meaningful quality tiers, but a click model trained on agent interactions degrades steadily (-8.5% AUC) as lower-quality agents enter the training data; (3) cross-community capability references spread endemically ($R_0 = 1.26$–$3.53$) and resist suppression even under aggressive modeled intervention. For retrieval systems, the question is no longer whether agent users will arrive, but whether models built on human-intent assumptions will survive their presence.
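A minimal sketch of why the reported $R_0$ range implies endemic, suppression-resistant spread. This uses the standard Kermack–McKendrick final-size relation $s_\infty = e^{-R_0(1 - s_\infty)}$ for the never-reached fraction $s_\infty$, solved by fixed-point iteration; it is an illustrative toy, not the paper's epidemic model or estimator:

```python
import math

def final_attack_fraction(r0: float, iters: int = 10000) -> float:
    """Solve the SIR final-size relation s = exp(-r0 * (1 - s)) by
    fixed-point iteration and return 1 - s, the fraction of the
    population (here: communities) ever reached by the spread.
    Illustrative sketch only -- not the paper's model."""
    s = 0.5  # initial guess for the never-reached fraction
    for _ in range(iters):
        s = math.exp(-r0 * (1 - s))
    return 1.0 - s

# Below the R_0 = 1 threshold the spread dies out; just above it
# (the lower estimate, 1.26) a large minority of communities is
# reached, and at the upper estimate (3.53) spread is near-total.
for r0 in (0.9, 1.26, 3.53):
    print(f"R0={r0}: attack fraction ~ {final_attack_fraction(r0):.2f}")
```

The qualitative point matches the abstract's claim: once $R_0 > 1$, interventions that merely reduce transmission without pushing $R_0$ below 1 still leave a substantial endemic equilibrium.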