The evolution from static ranking models to Agentic Recommender Systems (Agentic RecSys) empowers AI agents to maintain long-term user profiles and autonomously plan service tasks. While this paradigm shift enhances personalization, it introduces a vulnerability: the agent's reliance on Long-term Memory (LTM). In this paper, we uncover a threat that we term "Visual Inception." Unlike traditional adversarial attacks that seek immediate misclassification, Visual Inception injects triggers into user-uploaded images (e.g., lifestyle photos) that act as "sleeper agents" within the system's memory. When retrieved during future planning, these poisoned memories hijack the agent's reasoning chain and steer it toward adversary-defined goals (e.g., promoting high-margin products) without requiring prompt injection. To mitigate this threat, we propose CognitiveGuard, a dual-process defense framework inspired by human cognition. It consists of a System 1 Perceptual Sanitizer (diffusion-based purification) that cleanses sensory inputs and a System 2 Reasoning Verifier (counterfactual consistency checks) that detects anomalies in memory-driven planning. Extensive experiments in a mock e-commerce agent environment show that Visual Inception achieves a Goal-Hit Rate (GHR) of about 85%, while CognitiveGuard reduces GHR to around 10% with configurable latency trade-offs (from about 1.5s in lite mode to about 6.5s for full sequential verification) and no observed quality degradation under our setup.
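The abstract names counterfactual consistency checks and the Goal-Hit Rate without spelling out either procedure. The sketch below is one plausible reading, not the paper's implementation: it assumes a leave-one-out check in which a retrieved memory is flagged when removing it alone changes the agent's plan, and it assumes GHR is the fraction of episodes whose outcome equals the adversary's target. The names `toy_planner`, `counterfactual_flags`, `goal_hit_rate`, and the `TRIGGER` marker are all hypothetical stand-ins for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical planner interface: retrieved memories in, recommended item out.
Planner = Callable[[Sequence[str]], str]


def toy_planner(memories: Sequence[str]) -> str:
    """Stand-in planner (assumption): a memory containing 'TRIGGER'
    hijacks the plan; otherwise the most frequent interest tag wins."""
    for memory in memories:
        if "TRIGGER" in memory:
            return "high_margin_item"
    counts = Counter(memory.split(":")[0] for memory in memories)
    return counts.most_common(1)[0][0] if counts else "default"


def counterfactual_flags(planner: Planner, memories: Sequence[str]) -> List[int]:
    """Return indices of memories whose removal alone changes the plan."""
    baseline = planner(memories)
    flagged = []
    for i in range(len(memories)):
        # Counterfactual: replan with memory i left out.
        ablated = [m for j, m in enumerate(memories) if j != i]
        if planner(ablated) != baseline:
            flagged.append(i)
    return flagged


def goal_hit_rate(outcomes: Sequence[str], target: str) -> float:
    """Assumed GHR definition: share of outcomes equal to the adversary's target."""
    if not outcomes:
        return 0.0
    return sum(outcome == target for outcome in outcomes) / len(outcomes)


poisoned = [
    "hiking: trail photo",
    "hiking: tent photo",
    "cooking: pasta photo",
    "hiking: TRIGGER summit photo",
]
clean = ["hiking: a", "hiking: b", "hiking: c", "cooking: d"]

print(counterfactual_flags(toy_planner, poisoned))  # only the poisoned entry is flagged
print(counterfactual_flags(toy_planner, clean))     # nothing is flagged
```

A leave-one-out check of this kind would also flag a legitimate memory that happens to be decisive on its own, so a practical verifier would presumably need an additional criterion to separate influential memories from poisoned ones. It also costs one extra planner call per retrieved memory, which is consistent in spirit with the latency trade-off the abstract reports for full sequential verification.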