We study the problem of computing a U-statistic with a kernel function $f$ of degree $k \ge 2$, i.e., the average of $f$ over all $k$-tuples of instances, in a federated learning setting. U-statistics of degree 2 include several useful statistics such as Kendall's $\tau$ coefficient, the Area Under the Receiver Operating Characteristic Curve (AUC), and the Gini mean difference. Existing methods provide solutions only under the lower-utility local differential privacy model and/or scale poorly in the size of the domain discretization. In this work, we propose a protocol that securely computes U-statistics of degree $k \ge 2$ under central differential privacy by leveraging Multi-Party Computation (MPC). Our method substantially improves accuracy compared to prior solutions. We provide a detailed theoretical analysis of its accuracy, communication cost, and computational cost. We also evaluate its performance empirically and obtain favorable results: for Kendall's $\tau$ coefficient, for example, our approach reduces the Mean Squared Error by up to four orders of magnitude over existing baselines.
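To make the definition concrete, the following minimal sketch computes a U-statistic in the clear, on a single machine, with no privacy protection. It is not the protocol proposed here; it only illustrates the quantity that the protocol estimates. The names `u_statistic`, `kendall_kernel`, and `gini_kernel` are illustrative assumptions, and the kernel is assumed to be symmetric, so averaging over unordered $k$-subsets gives the same value as averaging over ordered $k$-tuples of distinct instances.

```python
from itertools import combinations
from math import comb


def u_statistic(data, kernel, k=2):
    """Average a symmetric kernel over all k-subsets of `data` (non-private)."""
    n = len(data)
    if n < k:
        raise ValueError("need at least k instances")
    total = sum(kernel(*subset) for subset in combinations(data, k))
    return total / comb(n, k)


def kendall_kernel(a, b):
    # a and b are (x, y) pairs: +1 if concordant, -1 if discordant, 0 if tied.
    s = (a[0] - b[0]) * (a[1] - b[1])
    return (s > 0) - (s < 0)


def gini_kernel(a, b):
    # Kernel of the Gini mean difference: absolute difference of two scalars.
    return abs(a - b)


# Toy data, chosen only for illustration.
pairs = [(1, 2), (2, 1), (3, 3), (4, 4)]
tau = u_statistic(pairs, kendall_kernel)   # 5 concordant, 1 discordant pair
gmd = u_statistic([1, 2, 4], gini_kernel)  # mean of |1-2|, |1-4|, |2-4|
```

The direct computation evaluates the kernel on all $\binom{n}{k}$ subsets and requires every instance to be visible in one place, which is exactly what is unavailable when the instances are held by separate parties in a federated setting.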