Advances in networks, accelerators, and cloud services encourage programmers to reconsider where to compute -- such as when fast networks make it cost-effective to compute on remote accelerators despite added latency. Workflow and cloud-hosted serverless computing frameworks can manage multi-step computations spanning federated collections of cloud, high-performance computing (HPC), and edge systems, but passing data among computational steps via cloud storage can incur high costs. Here, we overcome this obstacle with a new programming paradigm that decouples control flow from data flow by extending the pass-by-reference model to distributed applications. We describe ProxyStore, a system that implements this paradigm by providing object proxies that act as wide-area object references with just-in-time resolution. This proxy model enables data producers to communicate data unilaterally, transparently, and efficiently to both local and remote consumers. We demonstrate the benefits of this model with synthetic benchmarks and real-world scientific applications, running across various computing platforms.
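The proxy idea described above can be illustrated with a minimal Python sketch. This is not ProxyStore's implementation or API. The names `LazyProxy`, `store_put`, `store_get`, `produce`, `schedule`, and `mean` are hypothetical, and an in-memory dictionary stands in for a remote object store. The sketch shows only the core behavior: control flow passes a lightweight reference, and the data is fetched the first time the consumer uses it.

```python
import pickle
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-in for a remote object store.
_STORE: Dict[str, bytes] = {}
FETCH_LOG: List[str] = []  # records every read so the laziness is observable


def store_put(key: str, obj: Any) -> None:
    """Serialize an object into the stand-in store."""
    _STORE[key] = pickle.dumps(obj)


def store_get(key: str) -> Any:
    """Fetch and deserialize an object, logging the access."""
    FETCH_LOG.append(key)
    return pickle.loads(_STORE[key])


class LazyProxy:
    """Stands in for a target object and resolves it on first use."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._resolved = False
        self._target = None

    def _resolve(self) -> Any:
        # Just-in-time resolution: call the factory once, then cache the result.
        if not self._resolved:
            self._target = self._factory()
            self._resolved = True
        return self._target

    def __getattr__(self, name: str) -> Any:
        # Invoked only when normal lookup on the proxy fails.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._resolve(), name)

    # Implicit special-method lookups bypass __getattr__,
    # so a few are forwarded explicitly.
    def __len__(self) -> int:
        return len(self._resolve())

    def __iter__(self):
        return iter(self._resolve())

    def __getitem__(self, index):
        return self._resolve()[index]


def produce(key: str, obj: Any) -> LazyProxy:
    """Producer side: write the data once, hand back only a reference."""
    store_put(key, obj)
    return LazyProxy(lambda: store_get(key))


def schedule(task: Callable[[Any], Any], arg: Any) -> Any:
    """Control flow: forwards the argument without inspecting it."""
    return task(arg)


def mean(values) -> float:
    """Consumer side: written as if it received a plain list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    proxy = produce("samples", list(range(1000)))
    print(FETCH_LOG)              # no fetch has happened yet
    print(schedule(mean, proxy))  # first use triggers a single fetch
    print(FETCH_LOG)
```

The consumer function needs no changes to accept a proxy, which is the sense in which the communication is transparent. The sketch leaves out serialization of the proxy itself, which a system passing references between machines would also need.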