While distributed training is often viewed as a solution to optimizing linear models on increasingly large datasets, inter-machine communication costs of popular distributed approaches can dominate as data dimensionality increases. Recent work on non-interactive algorithms shows that approximate solutions for linear models can be obtained efficiently with only a single round of communication among machines. However, this approximation often degenerates as the number of machines increases. In this paper, building on the recent optimal weighted average method, we introduce a new technique, ACOWA, that allows an extra round of communication to achieve noticeably better approximation quality with minor runtime increases. Results show that for sparse distributed logistic regression, ACOWA obtains solutions that are more faithful to the empirical risk minimizer and attain substantially higher accuracy than other distributed algorithms.