Privacy deceptive patterns in web interfaces systematically manipulate users into disclosing personal data, yet existing defenses are fragmented, static, and increasingly vulnerable to manipulation by large language models. Moreover, data voids (areas of information scarcity within the web ecosystem) create fertile ground for adversaries to inject misleading content that AI systems can scrape and learn from, thereby amplifying both deceptive design and model misbehavior. In this paper, we formalize a new threat model, AI grooming, in which attackers exploit data voids to seed benign-looking but malicious samples that corrupt model reasoning and normalize deceptive practices. To address this threat in the context of privacy deceptive patterns, we present DPAgent, an agentic, reasoning-aware framework that orchestrates four specialized agents. DPAgent mitigates AI grooming through a proactive defense that combines latent space purification with defensive prompting, and it operates directly in live web environments to explore, detect, and repair privacy deceptive user interfaces before they reach end users. Extensive evaluations show that DPAgent detects 90.98% of groomed samples, achieves state-of-the-art privacy deceptive pattern detection with a micro F1 of 0.816, explores over 80% of pattern types while visiting only about 10% of the pages required by baselines, and successfully repairs 77% of detected deceptive interfaces. A large-scale study of 485 websites in the wild reveals that up to 98% contain at least one privacy deceptive pattern, over 90% of which can be mitigated by DPAgent. User studies further confirm that DPAgent effectively reduces privacy risks while preserving the browsing experience. Our results demonstrate the promise of agent-in-the-middle defenses for securing the web UI supply chain against deceptive design and emerging AI threats rooted in data void exploitation.