Inference centers need more data to build more comprehensive and useful learning models, so they must collect data from data providers. Data providers, however, are reluctant to hand over their datasets because of privacy concerns. In this paper, we modify the structure of the autoencoder to obtain a method that manages the utility-privacy trade-off well. More precisely, the data is first compressed by the encoder; a classifier then separates the confidential and non-confidential features and decorrelates them. The confidential feature is appropriately combined with noise, the non-confidential feature is enhanced, and the decoder finally produces data in the original data format. The proposed architecture also lets data providers set the level of privacy required for the confidential features. The proposed method has been evaluated on both image and categorical datasets, and the results show a significant performance improvement over previous methods.
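The data flow described in the abstract (encode, split the latent code, perturb the confidential part, decode) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the random orthonormal linear map stands in for a trained autoencoder, the fixed index split stands in for the classifier-driven separation, additive Gaussian noise is only one possible way to combine the confidential feature with noise, and the enhancement of the non-confidential feature is omitted. The names `make_linear_autoencoder`, `sanitize`, `n_confidential`, and `privacy_level` are hypothetical.

```python
import numpy as np


def make_linear_autoencoder(input_dim, latent_dim, rng):
    """Build a toy linear encoder/decoder pair with orthonormal columns.

    Assumption: a stand-in for a trained autoencoder, used only so the
    sketch runs end to end without any training loop.
    """
    q, _ = np.linalg.qr(rng.standard_normal((input_dim, latent_dim)))

    def encode(x):
        return x @ q        # (n, input_dim) -> (n, latent_dim)

    def decode(z):
        return z @ q.T      # (n, latent_dim) -> (n, input_dim)

    return encode, decode


def sanitize(x, encode, decode, n_confidential, privacy_level, rng):
    """Encode, add noise to the confidential latent block, then decode.

    Assumption: the first `n_confidential` latent dimensions carry the
    confidential feature. In the paper a classifier performs this
    separation; here it is a fixed split for illustration.
    """
    z = encode(x)
    z_conf = z[:, :n_confidential]
    z_pub = z[:, n_confidential:]

    # privacy_level scales the noise: 0 leaves the confidential block
    # untouched, larger values hide it more at the cost of utility.
    noise = privacy_level * rng.standard_normal(z_conf.shape)
    z_out = np.concatenate([z_conf + noise, z_pub], axis=1)

    # The output has the same shape as the input records.
    return decode(z_out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    encode, decode = make_linear_autoencoder(input_dim=16, latent_dim=6, rng=rng)
    x = rng.standard_normal((4, 16))
    released = sanitize(x, encode, decode, n_confidential=2,
                        privacy_level=1.0, rng=rng)
    print(released.shape)
```

In this toy setting the single `privacy_level` knob plays the role of the provider-controlled privacy setting: the non-confidential latent block passes through unchanged, while the confidential block is increasingly masked as the value grows.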