Traffic Engineering (TE) is critical for improving network performance and reliability. A key challenge in TE is managing sudden traffic bursts. Existing TE schemes often struggle to determine how much weight to give these bursts, and therefore have difficulty balancing performance under normal and peak traffic conditions. To address this issue, we introduce FIGRET, a Fine-Grained Robustness-Enhanced TE scheme. FIGRET provides varying levels of robustness enhancement, customized to the distinct traffic characteristics of individual source-destination pairs. By combining a carefully designed loss function with deep learning, FIGRET generates high-quality TE solutions efficiently. Our evaluations on real-world production networks, including Wide Area Networks and data centers, demonstrate that FIGRET significantly outperforms existing TE schemes. Compared to the TE scheme currently deployed in Google's Jupiter network, FIGRET achieves a 9\%-34\% reduction in average Maximum Link Utilization and improves solution speed by $35\times$-$1800\times$. Against DOTE, a state-of-the-art deep learning-based TE method, FIGRET reduces the occurrence of significant congestion events triggered by traffic bursts by 41\%-53.9\% in topologies characterized by high traffic dynamics.
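To make the Maximum Link Utilization (MLU) metric and the normal-versus-burst trade-off concrete, the following is a minimal sketch, not FIGRET's implementation. It assumes a path-based TE configuration in which each source-destination pair splits its demand across candidate paths; the function name `max_link_utilization`, the three-node topology, the capacities, and the split ratios are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def max_link_utilization(demands, paths, split_ratios, capacities):
    """Return the MLU: the highest load-to-capacity ratio over all links.

    demands      -- {(src, dst): traffic volume}
    paths        -- {(src, dst): [path, ...]}, each path a list of nodes
    split_ratios -- {(src, dst): [fraction of demand sent on each path]}
    capacities   -- {(u, v): capacity of directed link u -> v}
    """
    load = defaultdict(float)
    for pair, volume in demands.items():
        for path, ratio in zip(paths[pair], split_ratios[pair]):
            # Each pair of consecutive nodes on the path is a directed link.
            for link in zip(path, path[1:]):
                load[link] += volume * ratio
    return max(load[link] / capacities[link] for link in capacities)


# Hypothetical three-node topology (illustrative values only).
capacities = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 10.0}
paths = {
    ("A", "C"): [["A", "C"], ["A", "B", "C"]],
    ("A", "B"): [["A", "B"]],
}
split_ratios = {("A", "C"): [0.75, 0.25], ("A", "B"): [1.0]}

# Normal traffic: the configuration keeps every link at or below 60%.
normal = {("A", "C"): 8.0, ("A", "B"): 4.0}
normal_mlu = max_link_utilization(normal, paths, split_ratios, capacities)

# Burst on the A->C pair: the same configuration now overloads link A->C.
burst = {("A", "C"): 16.0, ("A", "B"): 4.0}
burst_mlu = max_link_utilization(burst, paths, split_ratios, capacities)

print(f"normal MLU = {normal_mlu:.2f}, burst MLU = {burst_mlu:.2f}")
```

In this toy setting, a split that looks balanced for the normal demand concentrates the burst on the direct A-to-C link. A configuration that hedges more for the bursty pair would give up some performance under normal traffic, which is the trade-off the abstract describes.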