We present a novel distributed computing framework that is robust to slow compute nodes and supports both approximate and exact computation of linear operations. The proposed mechanism integrates randomized sketching and polar codes in the context of coded computation. We propose a sequential decoding algorithm designed to handle real-valued data while keeping the computational complexity of recovery low. Additionally, we provide an anytime estimator that can generate provably accurate estimates even when the set of available node outputs is not decodable. We demonstrate potential applications of this framework in contexts such as large-scale matrix multiplication and black-box optimization. We implement these methods on a serverless cloud computing system and provide numerical results that demonstrate their scalability in practice, including ImageNet-scale computations.
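To make the idea of an anytime estimate concrete, the following is a minimal sketch of a sketching-based estimator for a matrix-vector product A x that tolerates missing worker outputs. It is an illustration only and not the polar-code construction or the sequential decoder of this work: it assumes each worker draws a random sign (Rademacher) vector s and returns (A s)(s · x), which equals A x in expectation, so averaging whichever outputs have arrived gives an unbiased estimate. The function names, the toy matrix, and the straggler model are all hypothetical.

```python
import random

def worker_task(A, x, s):
    # One worker (hypothetical): returns (A s) * (s . x) for a sign vector s.
    # Since E[s s^T] = I for Rademacher s, the expectation of this output is A x.
    sx = sum(si * xi for si, xi in zip(s, x))
    return [sx * sum(aij * sj for aij, sj in zip(row, s)) for row in A]

def anytime_estimate(outputs):
    # Average whatever worker outputs have arrived so far.
    k = len(outputs)
    return [sum(col) / k for col in zip(*outputs)]

def run_demo(num_workers=2000, straggler_prob=0.3, seed=0):
    rng = random.Random(seed)
    A = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0]]  # toy data, assumed for illustration
    x = [1.0, 0.5, 2.0]
    exact = [sum(a * b for a, b in zip(row, x)) for row in A]
    outputs = []
    for _ in range(num_workers):
        s = [rng.choice((-1.0, 1.0)) for _ in x]
        if rng.random() < straggler_prob:
            continue  # simulated straggler: this output never arrives
        outputs.append(worker_task(A, x, s))
    return exact, anytime_estimate(outputs)

if __name__ == "__main__":
    exact, estimate = run_demo()
    print("exact   :", exact)
    print("estimate:", estimate)
```

The estimate sharpens as more outputs arrive, and no particular subset of workers is required. An exact recovery scheme, by contrast, needs a decodable set of outputs, which is the gap an anytime estimator is meant to cover.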