Advances in networks, accelerators, and cloud services encourage programmers to reconsider where to compute -- such as when fast networks make it cost-effective to compute on remote accelerators despite added latency. Workflow and cloud-hosted serverless computing frameworks can manage multi-step computations spanning federated collections of cloud, high-performance computing (HPC), and edge systems, but passing data among computational steps via cloud storage can incur high costs. Here, we overcome this obstacle with a new programming paradigm that decouples control flow from data flow by extending the pass-by-reference model to distributed applications. We describe ProxyStore, a system that implements this paradigm by providing object proxies that act as wide-area object references with just-in-time resolution. This proxy model enables data producers to communicate data unilaterally, transparently, and efficiently to both local and remote consumers. We demonstrate the benefits of this model with synthetic benchmarks and real-world scientific applications, running across various computing platforms.