Large language model (LLM) agents are increasingly deployed in personalized tasks involving sensitive, context-dependent information, where privacy violations may arise in agents' actions due to the implicit nature of contextual privacy. Existing approaches rely on external, inference-time interventions, which are brittle, scenario-specific, and may expand the privacy attack surface. We propose PrivAct, a contextual privacy-aware multi-agent learning framework that internalizes contextual privacy preservation directly into models' generation behavior, yielding privacy-compliant agentic actions. By embedding privacy preferences into each agent, PrivAct enhances system-wide contextual integrity while achieving a more favorable privacy-helpfulness tradeoff. Experiments across multiple LLM backbones and benchmarks demonstrate consistent improvements in contextual privacy preservation, reducing leakage rates by up to 12.32% while maintaining comparable helpfulness, as well as zero-shot generalization and robustness across diverse multi-agent topologies. Code is available at https://github.com/chengyh23/PrivAct.