Programmatically generating tight differential privacy (DP) bounds is a hard problem. Two core challenges are (1) finding expressive, compact, and efficient encodings of the distributions of DP algorithms, and (2) state space explosion stemming from the multiple quantifiers and relational properties of the DP definition. We address the first challenge by developing a method for tight privacy and accuracy bound synthesis using weighted model counting on binary decision diagrams, a state of the art technique from the artificial intelligence and automated reasoning communities for exactly computing probability distributions. We address the second challenge by developing a framework for leveraging inherent symmetries in DP algorithms. Our solution benefits from ongoing research in probabilistic programming languages, allowing us to succinctly and expressively represent different DP algorithms with approachable language syntax that can be used by non-experts. We provide a detailed case study of our solution on the binary randomized response algorithm. We also evaluate an implementation of our solution using the Dice probabilistic programming language for the randomized response and truncated geometric above threshold algorithms. We compare to prior work on exact DP verification using Markov chain probabilistic model checking. Very few existing works consider mechanized analysis of accuracy guarantees for DP algorithms. We additionally provide a detailed analysis using our technique for finding tight accuracy bounds for DP algorithms.
翻译:程序化地生成严格的差分隐私(DP)界限是一个难题。两个核心挑战是:(1)为DP算法的分布找到表达力强、紧凑且高效的编码方式;(2)由DP定义中的多个量词和关系属性引发的状态空间爆炸问题。我们通过利用人工智能和自动推理领域最先进的加权模型计数技术——在二元决策图上精确计算概率分布——来解决第一个挑战,开发了一种用于严格隐私与精度界限合成的方法。我们通过构建一套利用DP算法内在对称性的框架来应对第二个挑战。我们的解决方案受益于概率编程语言的持续研究,使我们能够以非专家也能使用的、通俗易懂的语言语法,简洁且富有表现力地表示不同的DP算法。我们以二元随机响应算法为例,对我们的解决方案进行了详细案例研究。我们还使用Dice概率编程语言对随机响应算法和截断几何高于阈值算法实现了我们的解决方案,并进行了评估。我们将之与使用马尔可夫链概率模型检查进行精确DP验证的先前工作进行了比较。现有工作中极少有对DP算法的精度保证进行机械化分析的。此外,我们利用我们的技术对寻找DP算法的严格精度界限进行了详细分析。