Observable Channels, Not Just Storage: Evaluating Privacy Leakage in LLM Agent Pipelines

Privacy leakage in LLM agents is often studied through individual storage or execution components, such as memory modules, retrieval pipelines, or tool-mediated artifacts. However, these settings are typically analyzed in isolation, making it difficult to compare how private internal dependence becomes externally recoverable across heterogeneous agent pipelines. In this paper, we present CIPL (Channel Inversion for Privacy Leakage) as a unified channel-oriented measurement interface for evaluating privacy leakage in LLM agent pipelines. Rather than claiming a universally strongest attack recipe, CIPL provides a shared way to represent a target through its sensitive source, selection, assembly, execution, observation, and extraction stages, and to measure how internal exposure is transformed into attacker-recoverable leakage under a common protocol. Using memory-based, retrieval-mediated, and tool-mediated instantiations under this shared interface, we identify a distinct cross-target risk picture. Memory behaves as a near-saturated high-risk special case, while beyond-memory leakage exhibits a different regime: retrieval-mediated targets show frequent but often incomplete leakage, and tool-mediated targets are strongly shaped by the exposed observation surface and provider behavior. We further show that leakage is governed by channel conditions rather than by a universally dominant recipe: cleaned weak controls sharply suppress leakage, and semantic annotation reveals attacker-useful leakage beyond exact-match extraction. Together, these findings suggest that privacy risk in LLM agent pipelines is better understood through \emph{observable channels}, not just storage components. More broadly, our results motivate channel-oriented privacy evaluation as a necessary complement to component-local or exact-only analyses.
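As a rough illustration of the six-stage representation described above, the following Python sketch models a target as a sequence of source, selection, assembly, execution, observation, and extraction steps, and compares what the pipeline internally depended on with what an attacker recovers by exact match. It is not the CIPL implementation: the class `ChannelTarget`, the functions `run_channel` and `exact_leakage_rate`, the echoing execution stub, and the toy records are all hypothetical names and assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ChannelTarget:
    """Hypothetical six-stage description of one pipeline target."""
    name: str
    sensitive_source: List[str]                    # private records the pipeline may depend on
    select: Callable[[List[str], str], List[str]]  # selection: records pulled in for a query
    assemble: Callable[[List[str], str], str]      # assembly: how records enter the context
    execute: Callable[[str], str]                  # execution: model or tool call (stubbed)
    observe: Callable[[str], str]                  # observation: what the attacker can see
    extract: Callable[[str], List[str]]            # extraction: attacker's parse of that view


def run_channel(target: ChannelTarget, query: str) -> Tuple[List[str], List[str]]:
    """Run one query and return (internally exposed records, recovered records)."""
    selected = target.select(target.sensitive_source, query)
    context = target.assemble(selected, query)
    raw_output = target.execute(context)
    visible = target.observe(raw_output)
    recovered = target.extract(visible)
    return selected, recovered


def exact_leakage_rate(selected: List[str], recovered: List[str]) -> float:
    """Fraction of internally exposed records recovered verbatim (exact match only)."""
    if not selected:
        return 0.0
    hits = sum(1 for record in selected if record in recovered)
    return hits / len(selected)


# Toy records and stage functions (all assumed, not taken from the paper).
records = ["alice@example.com", "bob@example.com"]
select_all = lambda source, query: list(source)
assemble = lambda recs, query: query + "\n" + "\n".join(recs)
echo = lambda context: context  # stand-in for a model that repeats its context
find_emails = lambda text: [line for line in text.splitlines() if "@" in line]

# Same internal exposure, two different observation surfaces.
full_view = ChannelTarget("full", records, select_all, assemble, echo,
                          lambda raw: raw, find_emails)
truncated_view = ChannelTarget("truncated", records, select_all, assemble, echo,
                               lambda raw: "\n".join(raw.splitlines()[:2]), find_emails)

full_rate = exact_leakage_rate(*run_channel(full_view, "list contacts"))
truncated_rate = exact_leakage_rate(*run_channel(truncated_view, "list contacts"))
```

Both toy targets select the same two records, so their internal exposure is identical; only the observation stage differs, and the exact-match rate changes with it. An exact-match scorer like this one would also miss paraphrased or partial disclosures, which is the gap the abstract attributes to semantic annotation.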
