Modern Artificial Intelligence (AI) technologies, led by Machine Learning (ML), have gained unprecedented momentum over the past decade. Riding this ``AI summer'', the network research community has also embraced AI/ML algorithms to address many problems in network operations and management. However, compared to their counterparts in other domains, most ML-based solutions have yet to see large-scale deployment because they are not mature enough for production settings. This paper concentrates on the practical issues of developing and operating ML-based solutions in real networks. Specifically, we enumerate the key factors hindering the integration of AI/ML in real networks and review existing solutions to uncover the considerations they miss. We also highlight two potential directions, MLOps and Causal ML, that can help close the gap. We believe this paper spotlights the system-related considerations in implementing and maintaining ML-based solutions and can invigorate their full adoption in future networks.