Programmatically generating tight differential privacy (DP) bounds is a hard problem. Two core challenges are (1) finding expressive, compact, and efficient encodings of the distributions of DP algorithms, and (2) state space explosion stemming from the multiple quantifiers and relational properties of the DP definition. We address the first challenge by developing a method for tight privacy and accuracy bound synthesis using weighted model counting on binary decision diagrams, a state of the art technique from the artificial intelligence and automated reasoning communities for exactly computing probability distributions. We address the second challenge by developing a framework for leveraging inherent symmetries in DP algorithms. Our solution benefits from ongoing research in probabilistic programming languages, allowing us to succinctly and expressively represent different DP algorithms with approachable language syntax that can be used by non-experts. We provide a detailed case study of our solution on the binary randomized response algorithm. We also evaluate an implementation of our solution using the Dice probabilistic programming language for the randomized response and truncated geometric above threshold algorithms. We compare to prior work on exact DP verification using Markov chain probabilistic model checking. Very few existing works consider mechanized analysis of accuracy guarantees for DP algorithms. We additionally provide a detailed analysis using our technique for finding tight accuracy bounds for DP algorithms.
翻译:程序化地生成紧致的差分隐私(DP)界限是一项难题。两个核心挑战为:(1)为DP算法的分布找到具有表达力、紧凑且高效的编码方式;(2)由DP定义中的多个量词和关系属性导致的状态空间爆炸问题。针对第一个挑战,我们开发了一种方法,利用人工智能与自动化推理领域的前沿技术——基于二元决策图的加权模型计数,精确计算概率分布,从而实现紧致的隐私与精度界限综合。针对第二个挑战,我们构建了一个框架,通过利用DP算法中固有的对称性予以解决。我们的解决方案受益于概率编程语言的持续研究,能够以简洁且富有表达力的语法表示不同的DP算法,使非专家用户也能轻松使用。我们以二元随机响应算法为例,详细展示了该方案的案例研究。此外,我们使用Dice概率编程语言对随机响应算法和截断几何超越阈值算法进行了实现评估,并与基于马尔可夫链概率模型检验的精确DP验证工作进行了比较。现有研究中极少涉及DP算法精度保证的机械化分析,我们还利用该方法对DP算法的紧致精度界限进行了详细分析。