Modern Artificial Intelligence (AI) technologies, led by Machine Learning (ML), have gained unprecedented momentum over the past decade. Riding this wave of ``AI summer'', the network research community has also embraced AI/ML algorithms to address many problems in network operations and management. However, compared to their counterparts in other domains, most ML-based networking solutions have yet to see large-scale deployment because they are not mature enough for production settings. This paper concentrates on the practical issues of developing and operating ML-based solutions in real networks. Specifically, we enumerate the key factors hindering the integration of AI/ML in real networks and review existing solutions to uncover the considerations they miss. Further, we highlight a promising direction, Machine Learning Operations (MLOps), that can close the gap. We believe this paper spotlights the system-related considerations in implementing and maintaining ML-based solutions and will invigorate their full adoption in future networks.