Cloud computing enables users to process and store data remotely on high-performance computers and servers by sharing data over the Internet. However, transferring data to the cloud raises unavoidable privacy concerns. Here, we present a synthesis framework for designing coding mechanisms that allow data to be shared and processed in a privacy-preserving manner without sacrificing data utility or algorithmic performance. We consider a setup in which the user aims to run an algorithm in the cloud using private data. The cloud then returns some utility to the user (utility refers to the service that the algorithm provides, e.g., classification, prediction, or AI models). To address privacy concerns, the proposed scheme provides tools to co-design: 1) coding mechanisms that distort the original data and guarantee a prescribed level of differential privacy; 2) an equivalent-but-different algorithm (referred to here as the target algorithm) that runs on the distorted data and produces distorted utility; and 3) a decoding function that extracts the true utility from the distorted one with negligible error. Then, instead of sharing the original data and algorithm with the cloud, the user discloses only the distorted data and the target algorithm, thereby avoiding privacy concerns. The proposed scheme is built on the synergy of differential privacy and system immersion tools from control theory. The key underlying idea is to design a higher-dimensional target algorithm that embeds all trajectories of the original algorithm and operates on randomly encoded data to produce randomly encoded utility. We show that the proposed scheme can be designed to offer any desired level of differential privacy without degrading the algorithm's utility. We present two use cases to illustrate the performance of the developed tools: privacy in optimization/learning algorithms and a nonlinear networked control system.
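To make the immersion idea concrete, the following is a minimal sketch for a linear iteration (gradient descent on a least-squares problem). It is an illustration under our own assumptions, not the construction developed in the paper: the names `Pi`, `Pi_left`, `N`, `F_t`, and `g_t_seq` are hypothetical, the encoding is a random affine map into a higher-dimensional space, and the Laplace noise scale is an arbitrary placeholder that is not calibrated to any prescribed differential privacy level. The sketch only shows that a higher-dimensional target algorithm can run on randomly encoded data while the user recovers the original iterate with a left inverse.

```python
import numpy as np

def run_original(F, g, x0, steps):
    """Original algorithm: x_{k+1} = F x_k + g."""
    x = x0.copy()
    for _ in range(steps):
        x = F @ x + g
    return x

def make_encoding(n, m, rng):
    """Random tall map Pi (m x n, m > n), a left inverse, and a noise basis N.

    N spans the orthogonal complement of range(Pi), so Pi_left @ N = 0 and
    any noise injected along N is removed exactly by decoding.
    """
    Pi = rng.standard_normal((m, n))
    Q, _ = np.linalg.qr(Pi, mode="complete")
    N = Q[:, n:]                    # basis of the complement of range(Pi)
    Pi_left = np.linalg.pinv(Pi)    # Pi_left @ Pi = I_n
    return Pi, Pi_left, N

def run_target(F_t, g_t_seq, z0):
    """Target algorithm run by the cloud: z_{k+1} = F_t z_k + g_t[k]."""
    z = z0.copy()
    for g_t in g_t_seq:
        z = F_t @ z + g_t
    return z

rng = np.random.default_rng(0)
n, m, steps = 3, 5, 50

# Private data (A, b) define a least-squares gradient-descent iteration.
A = rng.standard_normal((8, n))
b = rng.standard_normal(8)
alpha = 1.0 / np.linalg.norm(A, 2) ** 2   # step size that keeps the iteration stable
F = np.eye(n) - alpha * A.T @ A
g = alpha * A.T @ b
x0 = np.zeros(n)

# User side: encode the algorithm and the data before disclosure.
Pi, Pi_left, N = make_encoding(n, m, rng)
F_t = Pi @ F @ Pi_left
noise = lambda: N @ rng.laplace(scale=1.0, size=m - n)  # placeholder scale (assumption)
g_t_seq = [Pi @ g + noise() for _ in range(steps)]
z0 = Pi @ x0 + noise()

# Cloud side: sees only F_t, g_t_seq, z0 and returns the encoded result z.
z = run_target(F_t, g_t_seq, z0)

# User side: decode the utility and compare with the unencoded run.
x_decoded = Pi_left @ z
x_true = run_original(F, g, x0, steps)
print(np.max(np.abs(x_decoded - x_true)))
```

In this sketch the encoded state satisfies z_k = Pi x_k + N s_k for some random s_k, so applying `Pi_left` recovers x_k up to floating-point error regardless of the injected noise. Choosing the noise distribution and the encoding maps so that a formal differential privacy guarantee holds is the design problem the framework addresses and is not attempted here.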