Nonlinear aggregation is central to modern distributed systems, yet its privacy behavior is far less understood than that of linear aggregation. Unlike linear aggregation, where mature mechanisms can often suppress information leakage, nonlinear operators impose inherent structural limits on the privacy guarantees that are theoretically achievable when the aggregate must be computed exactly. This paper develops a unified information-theoretic framework for characterizing privacy leakage in distributed nonlinear aggregation under a joint adversary that combines passive (honest-but-curious) corruption of nodes with eavesdropping on communication channels. We cover two broad classes of nonlinear aggregates: order-based operators (maximum/minimum and top-$K$) and robust aggregation (median/quantiles and trimmed mean). We first derive fundamental lower bounds on leakage that hold when accuracy is not sacrificed, thereby identifying the minimum unavoidable information revealed by the computation and the transcript. We then propose simple yet effective privacy-preserving distributed algorithms and show that, with appropriate randomized initialization and parameter choices, they attain the derived lower bounds for the considered operators. Extensive experiments validate the tightness of the bounds and demonstrate that network topology and key algorithmic parameters (including the stepsize) govern the observed leakage in line with the theoretical analysis, yielding actionable guidelines for privacy-preserving nonlinear aggregation.