Deploying data- and computation-intensive applications such as large-scale AI into heterogeneous dispersed computing networks can significantly enhance application performance by mitigating bottlenecks caused by limited network resources, including bandwidth, storage, and computing power. However, current resource allocation methods in dispersed computing do not provide a comprehensive solution that jointly considers arbitrary topologies, elastic resource amounts, reuse of computation results, and nonlinear congestion-dependent optimization objectives. In this paper, we propose LOAM, a low-latency joint communication, caching, and computation placement framework with a rigorous analytical foundation that incorporates all of these aspects. We tackle the NP-hard aggregated cost minimization problem with two methods: an offline method with a 1/2-approximation guarantee and an online adaptive method with a bounded gap from the optimum. In extensive simulations, the proposed framework outperforms multiple baselines in both synthetic and real-world network scenarios.