Web Agents are increasingly deployed to perform complex tasks in real web environments, yet their security evaluation remains fragmented and difficult to standardize. We present WebTrap Park, an automated platform that systematically evaluates the security of Web Agents by directly observing their concrete interactions with live web pages. WebTrap Park instantiates three major sources of security risk as 1,226 executable evaluation tasks and enables action-based assessment without requiring any modification to the agent. Our results reveal clear security differences across agent frameworks, highlighting that agent architecture matters in addition to the underlying model. WebTrap Park is publicly accessible at https://security.fudan.edu.cn/webagent and provides a scalable foundation for reproducible Web Agent security evaluation.