Machine learning has shown tremendous potential for improving the capabilities of network traffic analysis applications, often outperforming simpler rule-based heuristics. However, ML-based solutions remain difficult to deploy in practice. Many existing approaches only optimize the predictive performance of their models, overlooking the practical challenges of running them against network traffic in real time. This is especially problematic in the domain of traffic analysis, where the efficiency of the serving pipeline is a critical factor in determining the usability of a model. In this work, we introduce CATO, a framework that addresses this problem by jointly optimizing the predictive performance and the associated systems costs of the serving pipeline. CATO leverages recent advances in multi-objective Bayesian optimization to efficiently identify Pareto-optimal configurations, and automatically compiles end-to-end optimized serving pipelines that can be deployed in real networks. Our evaluations show that compared to popular feature optimization techniques, CATO can provide up to 3600x lower inference latency and 3.7x higher zero-loss throughput while simultaneously achieving better model performance.
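To make the notion of a Pareto-optimal configuration concrete, the following minimal sketch filters a set of already-measured candidate configurations down to those not dominated on two objectives: predictive performance (higher is better) and inference latency (lower is better). It is only an illustration of Pareto dominance, not CATO's search procedure, which the abstract describes as multi-objective Bayesian optimization. The `Config` type, the objective names, and all numeric values are hypothetical and chosen purely for the example.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """Hypothetical record of one measured serving-pipeline configuration."""
    name: str          # illustrative label, not a CATO identifier
    f1_score: float    # predictive performance, higher is better
    latency_us: float  # systems cost (inference latency), lower is better


def dominates(a: Config, b: Config) -> bool:
    """Return True if `a` is at least as good as `b` on both objectives
    and strictly better on at least one."""
    no_worse = a.f1_score >= b.f1_score and a.latency_us <= b.latency_us
    strictly_better = a.f1_score > b.f1_score or a.latency_us < b.latency_us
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep only configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up measurements, for illustration only.
candidates = [
    Config("A", 0.95, 900.0),
    Config("B", 0.93, 40.0),
    Config("C", 0.90, 60.0),    # dominated by B (lower F1, higher latency)
    Config("D", 0.80, 5.0),
    Config("E", 0.95, 1200.0),  # dominated by A (same F1, higher latency)
]

front = pareto_front(candidates)
print([c.name for c in front])  # ['A', 'B', 'D']
```

Each surviving configuration represents a different trade-off: none can be improved on one objective without giving something up on the other. An exhaustive filter like this needs every candidate to be measured first; a Bayesian optimizer instead aims to approximate the same front while evaluating far fewer configurations.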