Autonomous web agents such as \textbf{OpenClaw} are rapidly moving into high-impact real-world workflows, but their robustness to live network threats remains insufficiently evaluated. Existing benchmarks focus mainly on static sandbox settings and content-level prompt attacks, leaving a practical gap in network-layer security testing. In this paper, we present \textbf{ClawTrap}, a \textbf{man-in-the-middle (MITM) red-teaming framework for real-world OpenClaw security evaluation}. ClawTrap supports diverse, customizable attack forms, including \textit{Static HTML Replacement}, \textit{Iframe Popup Injection}, and \textit{Dynamic Content Modification}, and provides a reproducible pipeline for rule-driven interception, transformation, and auditing. This design lays a foundation for future work to construct richer, customizable MITM attacks and to perform systematic security testing across agent frameworks and model backbones. Our empirical study shows clear stratification across models: weaker models are more likely to trust tampered observations and produce unsafe outputs, while stronger models demonstrate better anomaly attribution and safer fallback strategies. These findings indicate that reliable OpenClaw security evaluation should explicitly incorporate dynamic real-world MITM conditions rather than relying only on static sandbox protocols.