GPU remoting is a promising technique for supporting AI applications, and networking plays a key role in enabling it. However, the latency and bandwidth that the network must provide for efficient remoting are unknown. In this paper, we take a GPU-centric approach to derive the minimum latency and bandwidth requirements for GPU remoting while ensuring little or no performance degradation for AI applications. Our study, which includes a theoretical model, demonstrates that with careful remoting design, unmodified AI applications can run on a remoting setup over commodity networking hardware with low network demands, incurring no overhead or even achieving better performance.