DPAgent-in-the-Middle: Agentic Defense and Repair Against AI-Groomed Deceptive Patterns

Privacy deceptive patterns in web interfaces systematically manipulate users into disclosing personal data, yet existing defenses are fragmented, static, and increasingly vulnerable to manipulation by large language models. Moreover, data voids, areas of information scarcity within the web ecosystem, create fertile ground for adversaries to inject misleading content that AI systems can scrape and learn from, thereby amplifying both deceptive design and model misbehavior. In this paper, we formalize a new threat model, AI grooming, in which attackers exploit data voids to seed benign-looking but malicious samples that corrupt model reasoning and normalize deceptive practices. To address this threat in the context of privacy deceptive patterns, we present DPAgent, an agentic and reasoning-aware framework that orchestrates four specialized agents. DPAgent mitigates AI grooming through a proactive defense that combines latent space purification with defensive prompting, and it operates directly in live web environments to explore, detect, and repair privacy deceptive user interfaces before they reach end users. Extensive evaluations show that DPAgent detects 90.98% of groomed samples, achieves state-of-the-art privacy deceptive pattern detection with a micro F1 of 0.816, explores over 80% of pattern types while visiting only about 10% of the pages required by baselines, and successfully repairs 77% of detected deceptive interfaces. A large-scale study of 485 websites in the wild reveals that up to 98% contain at least one privacy deceptive pattern, over 90% of which can be mitigated by DPAgent. User studies further confirm that DPAgent effectively reduces privacy risks while preserving the browsing experience. Our results demonstrate the promise of agent-in-the-middle defenses for securing the web UI supply chain against deceptive design and emerging AI threats rooted in data void exploitation.

