Large-scale video conferencing services incur significant network cost while serving surging global demand. Our work systematically explores the opportunity to offload a fraction of this traffic from the private WAN to the Internet, a cheaper routing option that cloud providers already offer, without degrading application performance. First, through a large-scale latency measurement study with 3.5 million data points per day spanning 241K source cities and 21 data centers across the globe, we demonstrate that Internet paths perform comparably to or better than the private WAN in parts of the world (e.g., Europe and North America). Next, we present Titan, a production system, live for more than 12 months, that uses this observation to carefully move a fraction of the conferencing traffic to the Internet. Finally, we propose Titan-Next, a research prototype that jointly assigns the conferencing server and the routing option (Internet or WAN) for individual calls. Using 5 weeks of production data, we show that Titan-Next reduces the sum of peak bandwidth on WAN links, which defines the operational network cost, by up to 61% compared to state-of-the-art baselines. We will open-source parts of the measurement data.
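To make the cost metric concrete, the following is a minimal sketch of how a "sum of peak bandwidth on WAN links" could be computed from per-call traffic samples. It is not Titan's or Titan-Next's implementation: the function name, the `(link, time_slot, mbps)` sample layout, the link names, and the toy numbers are all assumptions made for illustration, since the abstract does not specify how links, time slots, or units are defined.

```python
from collections import defaultdict


def sum_of_peak_bandwidth(samples):
    """Return the sum, over WAN links, of each link's peak aggregate load.

    `samples` is an iterable of (link_id, time_slot, mbps) tuples, one per
    call contribution. This layout is hypothetical, chosen for illustration.
    """
    # Aggregate all traffic that lands on the same link in the same time slot.
    load = defaultdict(float)
    for link, slot, mbps in samples:
        load[(link, slot)] += mbps

    # Keep the highest aggregate load observed on each link.
    peak = defaultdict(float)
    for (link, _slot), total in load.items():
        peak[link] = max(peak[link], total)

    return sum(peak.values())


# Toy data (made up): two WAN links, two time slots.
baseline = [
    ("wan-a", 0, 40.0), ("wan-a", 0, 30.0), ("wan-a", 1, 20.0),
    ("wan-b", 0, 10.0), ("wan-b", 1, 50.0),
]
# Same traffic, but the 30 Mbps call on wan-a in slot 0 is routed over the
# Internet instead, so it no longer counts against any WAN link.
offloaded = [
    ("wan-a", 0, 40.0), ("wan-a", 1, 20.0),
    ("wan-b", 0, 10.0), ("wan-b", 1, 50.0),
]

before = sum_of_peak_bandwidth(baseline)
after = sum_of_peak_bandwidth(offloaded)
print(f"sum of peaks: {before} -> {after} Mbps")
```

In this toy case, offloading one call lowers the metric only because that call contributed to the peak slot of its link; moving traffic out of an off-peak slot would leave the sum unchanged. This is one reason the choice of which calls to move matters for cost.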