Consider a pair of random variables $(X,Y)$ distributed according to a given joint distribution $p_{XY}$. A curator wishes to disclose as much information about $Y$ as possible while limiting the information leaked about $X$. With mutual information as the measure of both utility and privacy, the problem is to maximize $I(Y;U)$ subject to $I(X;U)\leq\epsilon$, where $U$ denotes the released random variable and $\epsilon$ is a given privacy threshold. Two settings are considered. In the first, the curator has access to $(X,Y)$, so the optimization is over $p_{U|XY}$; in the second, the curator observes only $Y$, and the optimization is over $p_{U|Y}$. In both settings, the utility-privacy trade-off is investigated from both theoretical and practical perspectives. More specifically, several privacy-preserving schemes are proposed in these settings, based on generalizing the notion of statistical independence. Moreover, closed-form solutions are provided in certain scenarios. Finally, convexity arguments are provided for the utility-privacy trade-off, viewed as a functional of the joint distribution $p_{XY}$.
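To make the two quantities in the trade-off concrete, the following is a minimal sketch, not the schemes proposed in the paper, of how $I(Y;U)$ and $I(X;U)$ could be evaluated numerically for a candidate release mechanism in the second setting, where $U$ is generated from $Y$ alone. It assumes finite alphabets, a joint pmf stored as an array indexed `[x, y]`, and a channel $p_{U|Y}$ stored as a row-stochastic array indexed `[y, u]`. The function names `mutual_information` and `utility_and_leakage` are hypothetical and chosen only for illustration.

```python
import numpy as np


def mutual_information(p_ab):
    """Mutual information I(A;B) in bits for a joint pmf given as a 2-D array."""
    p_ab = np.asarray(p_ab, dtype=float)
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of A, shape (|A|, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of B, shape (1, |B|)
    product = p_a @ p_b                     # product of marginals p(a) p(b)
    mask = p_ab > 0                         # terms with p(a,b) = 0 contribute 0
    return float(np.sum(p_ab[mask] * np.log2(p_ab[mask] / product[mask])))


def utility_and_leakage(p_xy, p_u_given_y):
    """Return (I(Y;U), I(X;U)) when U depends on Y only, i.e. X - Y - U.

    Assumed layout (illustrative, not from the paper):
      p_xy[x, y]        joint pmf of (X, Y)
      p_u_given_y[y, u] release mechanism, each row sums to 1
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_u_given_y = np.asarray(p_u_given_y, dtype=float)
    p_y = p_xy.sum(axis=0)                  # marginal of Y
    p_yu = p_y[:, None] * p_u_given_y       # joint pmf of (Y, U)
    p_xu = p_xy @ p_u_given_y               # joint pmf of (X, U), summing over y
    return mutual_information(p_yu), mutual_information(p_xu)


if __name__ == "__main__":
    # Toy joint distribution: X and Y are correlated binary variables.
    p_xy = np.array([[0.4, 0.1],
                     [0.1, 0.4]])
    # Candidate mechanism: pass Y through a binary symmetric channel.
    p_u_given_y = np.array([[0.9, 0.1],
                            [0.1, 0.9]])
    utility, leakage = utility_and_leakage(p_xy, p_u_given_y)
    print(f"I(Y;U) = {utility:.4f} bits, I(X;U) = {leakage:.4f} bits")
```

A candidate mechanism is feasible for a threshold $\epsilon$ when the second returned value is at most $\epsilon$. Releasing $Y$ unchanged gives the largest utility but leaks $I(X;Y)$, while a constant $U$ leaks nothing and carries no utility; the trade-off studied here lies between these two extremes.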