In this work, we propose methods for accelerating distributed linear regression while keeping the data secure. We leverage randomized sketching techniques and improve straggler resilience in asynchronous systems. Specifically, we apply a random orthonormal matrix and then subsample \textit{blocks}, which simultaneously secures the information and reduces the dimension of the regression problem. In our setup, a centralized coded computing network, the transformation corresponds to an encoded encryption in an \textit{approximate gradient coding scheme}, and the subsampling corresponds to the responses of the non-straggling workers. This results in a distributed \textit{iterative sketching} approach for an $\ell_2$-subspace embedding, \textit{i.e.}, a new sketch is considered at each iteration. We also focus on the special case of the \textit{Subsampled Randomized Hadamard Transform}, which we generalize to block sampling, and discuss how it can be modified in order to secure the data.
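As a rough illustration of the pipeline described above (rotate the data with a random orthonormal matrix, then at every iteration use only a fresh subsample of row blocks), the following NumPy sketch runs gradient descent on a small least-squares problem. It is a minimal single-machine simulation under stated assumptions, not the scheme analyzed in this work: the function names, the QR-based construction of the orthonormal matrix, uniform block sampling without replacement, the step size, and the noise-free data are all choices made for this example, and no security property is modeled.

```python
import numpy as np


def random_orthonormal(n, rng):
    """Draw an n x n orthonormal matrix via QR of a Gaussian matrix (assumed construction)."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs so the result does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))


def iterative_block_sketch_lstsq(A, b, num_blocks, num_responders, num_iters, rng):
    """Gradient descent where each step uses a new block-subsampled sketch."""
    n, d = A.shape
    assert n % num_blocks == 0, "rows must split evenly into blocks"
    rows = n // num_blocks

    # Encode once: rotate the data, then partition it into row blocks
    # (one block per hypothetical worker).
    Pi = random_orthonormal(n, rng)
    A_enc = (Pi @ A).reshape(num_blocks, rows, d)
    b_enc = (Pi @ b).reshape(num_blocks, rows)

    # Rescaling keeps the subsampled gradient unbiased for the full gradient.
    scale = num_blocks / num_responders
    # Conservative step size based on the spectral norm of A (an assumption).
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)

    x = np.zeros(d)
    for _ in range(num_iters):
        # Blocks returned by the non-straggling workers in this iteration.
        idx = rng.choice(num_blocks, size=num_responders, replace=False)
        A_s = A_enc[idx].reshape(-1, d)
        b_s = b_enc[idx].reshape(-1)
        grad = 2.0 * scale * A_s.T @ (A_s @ x - b_s)
        x = x - step * grad
    return x


rng = np.random.default_rng(0)
A = rng.standard_normal((256, 5))
x_true = rng.standard_normal(5)
b = A @ x_true  # noise-free targets, assumed so the iterates can converge exactly

x_hat = iterative_block_sketch_lstsq(
    A, b, num_blocks=16, num_responders=8, num_iters=200, rng=rng
)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print("distance to least-squares solution:", np.linalg.norm(x_hat - x_ls))
```

Because the rotation is orthonormal, the full encoded problem has the same least-squares solution as the original one; the block subsampling only changes which part of the encoded gradient is used at each step. With noisy targets, a constant step size would generally leave the iterates in a neighborhood of the solution rather than converging exactly.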