Firms and statistical agencies must protect the privacy of the individuals whose data they collect, analyze, and publish. Increasingly, these organizations do so by using publication mechanisms that satisfy differential privacy. We consider the problem of choosing such a mechanism so as to maximize the value of its output to end users. We show that mechanisms that add noise to the statistic of interest, as most of those used in practice do, are generally not optimal when the statistic is a sum or average of magnitude data (e.g., income). However, we also show that adding noise is always optimal when the statistic is a count of data entries with a certain characteristic and the underlying database is drawn from a symmetric distribution (e.g., if individuals' data are i.i.d.). When, in addition, data users have supermodular payoffs, we show that the simple geometric mechanism is always optimal. The proof uses a novel comparative static that ranks information structures according to their usefulness in supermodular decision problems.
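To make "adding noise to a count" concrete, the following is a minimal Python sketch of a geometric mechanism. It assumes that the geometric mechanism mentioned above is the standard two-sided geometric noise from the differential privacy literature, with P(Z = z) proportional to exp(-epsilon * |z|); the abstract does not spell this out, and the paper may use a truncated variant. The names `noise_pmf`, `sample_noise`, and `geometric_mechanism`, the `epsilon` parameterization, and the sampling method are illustrative choices, not taken from the paper.

```python
import math
import random


def noise_pmf(z: int, epsilon: float) -> float:
    """Probability that two-sided geometric noise equals z (assumed form)."""
    alpha = math.exp(-epsilon)
    return (1.0 - alpha) / (1.0 + alpha) * alpha ** abs(z)


def sample_noise(epsilon: float, rng: random.Random) -> int:
    """Draw two-sided geometric noise as the difference of two geometrics."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    alpha = math.exp(-epsilon)

    def one_sided() -> int:
        # Inverse-CDF sampling of G with P(G >= k) = alpha ** k, k = 0, 1, ...
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        return int(math.floor(math.log(u) / math.log(alpha)))

    # The difference of two i.i.d. one-sided geometrics has pmf noise_pmf.
    return one_sided() - one_sided()


def geometric_mechanism(true_count: int, epsilon: float, rng: random.Random) -> int:
    """Publish the count plus integer noise (no truncation to a valid range)."""
    return true_count + sample_noise(epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    print([geometric_mechanism(100, 1.0, rng) for _ in range(5)])
```

Under this assumed form, changing the true count by one changes the probability of any published value by a factor of at most exp(epsilon), which is the differential privacy constraint for a counting query. Smaller values of `epsilon` give stronger privacy and noisier output.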