Advances in networks, accelerators, and cloud services encourage programmers to reconsider where to compute -- such as when fast networks make it cost-effective to compute on remote accelerators despite added latency. Workflow and cloud-hosted serverless computing frameworks can manage multi-step computations spanning federated collections of cloud, high-performance computing (HPC), and edge systems, but passing data among computational steps via cloud storage can incur high costs. Here, we overcome this obstacle with a new programming paradigm that decouples control flow from data flow by extending the pass-by-reference model to distributed applications. We describe ProxyStore, a system that implements this paradigm by providing object proxies that act as wide-area object references with just-in-time resolution. This proxy model enables data producers to communicate data unilaterally, transparently, and efficiently to both local and remote consumers. We demonstrate the benefits of this model with synthetic benchmarks and real-world scientific applications, running across various computing platforms.
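As a rough illustration of the proxy idea described above, the following minimal sketch shows how a lightweight reference can travel with the control flow while the data itself is fetched only on first use. It is not ProxyStore's API: the `LazyProxy` class, the `put` helper, and the in-memory `_STORE` dictionary (standing in for a remote object store) are all hypothetical names introduced for this example.

```python
import pickle
import uuid

# Hypothetical stand-in for a remote object store (assumption for this sketch).
_STORE = {}


def put(obj):
    """Serialize obj into the store and return the key that references it."""
    key = uuid.uuid4().hex
    _STORE[key] = pickle.dumps(obj)
    return key


class LazyProxy:
    """Hypothetical proxy: holds only a key and resolves the target on first use."""

    def __init__(self, key):
        self._key = key
        self._target = None
        self._resolved = False

    def _resolve(self):
        # Just-in-time resolution: fetch and deserialize once, then cache.
        if not self._resolved:
            self._target = pickle.loads(_STORE[self._key])
            self._resolved = True
        return self._target

    def __getattr__(self, name):
        # Invoked only when normal lookup fails; private names are never forwarded.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._resolve(), name)

    def __len__(self):
        # Special methods bypass __getattr__, so they are forwarded explicitly.
        return len(self._resolve())


def consumer(data):
    """A consumer written for a plain list; it is unaware it receives a proxy."""
    return data.count(3)


proxy = LazyProxy(put([1, 2, 3, 3]))
resolved_before = proxy._resolved  # nothing has been fetched yet
result = consumer(proxy)           # first attribute access triggers the fetch
resolved_after = proxy._resolved
```

In this sketch the producer hands over only a key-sized reference, and the consumer pays the transfer cost at the point of first access. A real system would replace the dictionary with a networked store and forward many more special methods than `__len__`.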