In this paper, we propose a practically efficient model for securely computing rank-based statistics (e.g., the median, percentiles, and quartiles) over distributed datasets in the malicious setting, without compromising the privacy of individual data. Building on the binary search technique of Aggarwal et al. (EUROCRYPT '04), we present an interactive protocol and a non-interactive protocol, each involving at most $\log ||R||$ rounds, where $||R||$ is the size of the range of the dataset elements. We also introduce a series of optimisation techniques to reduce the round complexity. Our computing model is modular and can be instantiated with either homomorphic encryption or secret-sharing schemes. Compared to state-of-the-art solutions, it provides stronger security and privacy while maintaining high efficiency and accuracy. Unlike differential-privacy-based solutions, it does not trade accuracy for privacy. Moreover, its time complexity is only $O(N \log ||R||)$, where $N$ is the dataset size, which is far more efficient than bitwise-comparison-based solutions with $O(N^2\log ||R||)$ time complexity. Finally, we provide a UC-secure instantiation based on the threshold Paillier cryptosystem and $\Sigma$-protocol zero-knowledge proofs of knowledge.
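The round bound of $\log ||R||$ follows from binary search over the value range. The sketch below is a plaintext simulation of that idea only, not the protocol of the paper: it contains no cryptography, and the names `local_count_below` and `kth_smallest` are hypothetical. It assumes integer data in a known range and that each party contributes one local count per round; in a secure instantiation the aggregate would be computed under homomorphic encryption or secret sharing rather than in the clear.

```python
from typing import List, Tuple


def local_count_below(data: List[int], guess: int) -> int:
    """Count one party's elements that are strictly below the guess."""
    return sum(1 for x in data if x < guess)


def kth_smallest(parties: List[List[int]], k: int, lo: int, hi: int) -> Tuple[int, int]:
    """Find the k-th smallest element (1-indexed) of the union of all
    parties' data, assuming every value lies in [lo, hi].
    Returns (value, number of rounds used)."""
    rounds = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2  # upper midpoint, so the range always shrinks
        # Plaintext stand-in for the secure aggregation step (assumption):
        # here the per-party counts are simply summed in the clear.
        below = sum(local_count_below(d, mid) for d in parties)
        rounds += 1
        if below < k:
            lo = mid       # fewer than k elements are < mid, so the answer is >= mid
        else:
            hi = mid - 1   # at least k elements are < mid, so the answer is < mid
    return lo, rounds


# Three parties, values in [0, 15], so the range size is 16.
parties = [[3, 9, 14], [1, 7], [5, 12]]
total = sum(len(d) for d in parties)
median, rounds = kth_smallest(parties, k=(total + 1) // 2, lo=0, hi=15)
print(median, rounds)
```

Each round costs one pass over every party's data, which gives $O(N)$ work per round and $O(N \log ||R||)$ overall in this simplified setting. Other rank-based statistics such as quartiles or percentiles correspond to a different choice of `k`.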