Recent advances in Artificial Intelligence, and in Large Language Models (LLMs) in particular, offer promising prospects for helping system administrators manage the complexity of modern networks. Despite this potential, the literature leaves a significant gap regarding the extent to which LLMs can understand computer networks. Without empirical evidence, system administrators might rely on these models with no assurance that they perform network-related tasks accurately. In this paper, we are the first to conduct an exhaustive study of LLMs' comprehension of computer networks. We formulate several research questions to determine whether LLMs can provide correct answers when supplied with a network topology and questions about it. To answer them, we developed a thorough framework for evaluating LLMs' capabilities in various network-related tasks. We apply our framework to multiple computer networks, employing both private (e.g., GPT4) and open-source (e.g., Llama2) models. Our findings are promising, with the best model achieving an average accuracy of 79.3%. Private LLMs achieve noteworthy results on small and medium networks, while challenges persist in comprehending complex network topologies, particularly for open-source models. Moreover, we provide insight into how prompt engineering can enhance accuracy on some tasks.