We propose a novel problem formulation for the privacy-utility tradeoff in which two distinct user groups each have their own set of private and utility attributes. Previous studies primarily consider scenarios where all users share identical private and utility attributes, and they often rely on auxiliary datasets or manual annotations. In contrast, we introduce a collaborative data-sharing mechanism between the two user groups through a trusted third party. This third party applies adversarial privacy techniques together with our proposed data-sharing mechanism to sanitize the data of both groups internally, which eliminates the need for manual annotation or auxiliary datasets. Our methodology ensures that private attributes cannot be accurately inferred while enabling highly accurate predictions of utility features. Importantly, even analysts or adversaries who possess auxiliary datasets containing raw data are unable to accurately deduce private features. Additionally, our data-sharing mechanism is compatible with various existing adversarially trained privacy techniques. We empirically demonstrate the effectiveness of our approach on synthetic and real-world datasets, showing that it balances the conflicting goals of privacy and utility.