Linear regression is a fundamental problem in supervised machine learning, with applications ranging from epidemiology to finance. In this work, we propose methods for accelerating distributed linear regression. We do so by leveraging randomized techniques, while also ensuring security and straggler resilience in asynchronous distributed computing systems. Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, simultaneously securing the information and reducing the dimension of the regression problem. In our setup, the basis rotation corresponds to an encoded encryption in an approximate gradient coding scheme, and the subsampling corresponds to the responses of the non-straggling servers in a centralized coded computing framework. The result is a distributed, iterative, stochastic approach to matrix compression and steepest descent.
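To make the pipeline concrete, the following is a minimal sketch of the idea described above, not the authors' implementation: randomly rotate the basis of the least-squares system, partition the rotated system into blocks (one per server), and take steepest-descent steps using only the blocks returned by a random subset of servers, standing in for the non-stragglers. The Haar-distributed rotation via QR, the block size q, the subsampling fraction, and the step size eta are all illustrative assumptions rather than choices taken from the work itself.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, q = 1000, 20, 100           # samples, features, block size (n % q == 0)
A = rng.standard_normal((n, d))   # data matrix
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)  # responses

# Random basis rotation: G is orthonormal, so (G @ A, G @ b) defines an
# equivalent least-squares problem; the rotation also obscures ("encrypts")
# the content of each individual block.
G, _ = np.linalg.qr(rng.standard_normal((n, n)))
A_rot, b_rot = G @ A, G @ b

# Partition the rotated system into n // q blocks, one per server.
blocks = [(A_rot[i:i + q], b_rot[i:i + q]) for i in range(0, n, q)]

x = np.zeros(d)
eta = 1.0 / np.linalg.norm(A, 2) ** 2   # conservative step size, 1 / ||A||_2^2
for _ in range(200):
    # Block subsampling: only a random half of the servers respond; their
    # partial gradients give an unbiased estimate of the full gradient.
    idx = rng.choice(len(blocks), size=len(blocks) // 2, replace=False)
    g = sum(Ai.T @ (Ai @ x - bi) for Ai, bi in (blocks[i] for i in idx))
    x -= eta * (n / (len(idx) * q)) * g  # rescale for the missing blocks

print("residual norm:", np.linalg.norm(A @ x - b))
```

Since G is orthonormal, the full gradient of the rotated problem equals that of the original problem; subsampling blocks after rotation is what makes each iteration both cheaper and an approximate (coded) gradient step.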