The recently introduced multi-user linearly-separable distributed computing framework has revealed how a parallel treatment of users can yield large parallelization gains at relatively low computation and communication costs. These gains stem from a new approach that converts the computing problem into a sparse matrix factorization problem: a matrix \(\mathbf{F}\) that describes the users' requests is decomposed as \(\mathbf{F} = \mathbf{D}\mathbf{E}\), where a \(\gamma\)-sparse \(\mathbf{E}\) defines the task allocation across \(N\) servers, and a \(\delta\)-sparse \(\mathbf{D}\) defines the connectivity between the \(N\) servers and \(K\) users as well as the decoding process. While this approach provides near-optimal performance, its linear nature has raised data secrecy concerns. We here adopt an information-theoretic secrecy framework, seeking guarantees that each user can learn nothing more than its own requested function. In this context, our main result provides two necessary and sufficient secrecy criteria: (i) for each user \(k\) who observes \(\alpha_k\) server responses, the common randomness visible to that user must span a subspace of dimension exactly \(\alpha_k-1\), and (ii) for each user, removing from \(\mathbf{D}\) the columns corresponding to the servers it observes must leave a matrix of rank at least \(K-1\). With these conditions in place, we design a general scheme, applicable to finite and non-finite fields alike, that is based on appending to \(\mathbf{E}\) a basis of \(\mathrm{Null}(\mathbf{D})\) and on carefully injecting shared randomness. In many cases, this entails no additional costs. While maintaining performance, the scheme guarantees perfect information-theoretic secrecy in the case of finite fields; in the real case, the conditions yield an explicit mutual-information bound that can be made arbitrarily small by increasing the variance of the Gaussian common randomness.
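Criterion (ii) can be checked mechanically for a given \(\mathbf{D}\). The following is a minimal sketch, not the authors' implementation. It assumes that user \(k\) observes exactly the servers at which row \(k\) of \(\mathbf{D}\) is nonzero, and it computes rank over the rationals; over a finite field, the elimination would have to use that field's arithmetic instead. The function names and the example matrices are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix with rational entries, via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    if not rows or not rows[0]:
        return 0
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def observed_servers(D, k):
    """Assumption: user k observes the servers where row k of D is nonzero."""
    return [n for n, d in enumerate(D[k]) if d != 0]


def satisfies_rank_condition(D, k):
    """Criterion (ii): D without user k's observed columns has rank >= K - 1."""
    K = len(D)
    seen = set(observed_servers(D, k))
    reduced = [[d for n, d in enumerate(row) if n not in seen] for row in D]
    return rank(reduced) >= K - 1


# Hypothetical example with K = 3 users and N = 5 servers.
D_ok = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
]
# Hypothetical example with K = 3 users and only N = 3 servers.
D_bad = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]

if __name__ == "__main__":
    for name, D in (("D_ok", D_ok), ("D_bad", D_bad)):
        checks = [satisfies_rank_condition(D, k) for k in range(len(D))]
        print(name, checks)
```

In this sketch, `len(observed_servers(D, k))` plays the role of \(\alpha_k\). Criterion (i) is not checked here, because it depends on how the common randomness is assigned to servers, which the abstract does not specify.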