Differentially Private Selection from Secure Distributed Computing

Given a collection of vectors $x^{(1)},\dots,x^{(n)} \in \{0,1\}^d$, the selection problem asks to report the index of an "approximately largest" entry in $x=\sum_{j=1}^n x^{(j)}$. Selection abstracts a host of problems: in machine learning it can be used for hyperparameter tuning, feature selection, or to model empirical risk minimization. We study selection under differential privacy, where the released index must preserve the privacy of each individual vector. Though selection can be solved with an excellent utility guarantee in the central model of differential privacy, the distributed setting lacks solutions. Specifically, strong privacy guarantees with high utility are offered in high-trust settings, but not in low-trust settings. For example, in the popular shuffle model of distributed differential privacy, there are strong lower bounds suggesting that the utility of the central model cannot be obtained. In this paper we design a protocol for differentially private selection in a trust setting similar to the shuffle model, with the crucial difference that our protocol tolerates corrupted servers while maintaining privacy. Our protocol uses techniques from secure multi-party computation (MPC) and: (i) has utility on par with the best mechanisms in the central model, (ii) scales to large, distributed collections of high-dimensional vectors, and (iii) uses $k\geq 3$ servers that collaborate to compute the result, where differential privacy holds assuming an honest majority. Since general-purpose MPC techniques are not sufficiently scalable, we propose a novel application of integer secret sharing, and evaluate the utility and efficiency of our protocol theoretically and empirically. Our protocol is the first to demonstrate that large-scale differentially private selection is possible in a distributed setting.
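
To make the problem statement concrete, the following is a minimal sketch of selection in the central model, where a single trusted party sees all vectors. It is not the distributed protocol described above. It assumes the standard report-noisy-max approach with Gumbel noise (equivalent to the exponential mechanism) and add/remove neighboring datasets, so each coordinate of $x$ changes by at most 1. The names `column_sums` and `private_select` are hypothetical and chosen for illustration only.

```python
import math
import random


def column_sums(vectors):
    """Compute x = sum_j x^(j) for a list of equal-length 0/1 vectors."""
    d = len(vectors[0])
    return [sum(v[i] for v in vectors) for i in range(d)]


def gumbel(scale, rng):
    """Draw one sample from a Gumbel(0, scale) distribution."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -scale * math.log(-math.log(u))


def private_select(vectors, epsilon, rng=None):
    """Return the index of an approximately largest entry of x.

    Assumption: each score x_i has sensitivity 1, so adding Gumbel noise
    with scale 2/epsilon and taking the argmax samples index i with
    probability proportional to exp(epsilon * x_i / 2), which is the
    exponential mechanism and satisfies epsilon-differential privacy.
    """
    rng = rng or random.Random()
    x = column_sums(vectors)
    scale = 2.0 / epsilon
    noisy = [xi + gumbel(scale, rng) for xi in x]
    return max(range(len(noisy)), key=noisy.__getitem__)


if __name__ == "__main__":
    # Toy data: coordinate 0 is set in most vectors.
    data = [[1, 0, 0]] * 200 + [[0, 1, 0]] * 5
    print(private_select(data, epsilon=1.0))
```

With a small `epsilon` the noise is large and the returned index can differ from the true maximum; the "approximately largest" guarantee refers to this gap. The distributed protocol aims for comparable utility without any single party ever seeing $x$ in the clear.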

