Large Language Models (LLMs) have gained massive popularity in recent years and are increasingly integrated into software systems for diverse purposes. However, integrating them poorly into source code may undermine software system quality, and, to our knowledge, there is no formal catalog of code smells specific to coding practices for LLM inference. In this paper, we introduce the concept of LLM code smells and, based on relevant literature, formalize five recurrent problematic coding practices related to LLM inference in software systems. We extend the detection tool SpecDetect4AI to cover the newly defined LLM code smells and use it to validate their prevalence in a dataset of 200 open-source LLM systems. Our results show that LLM code smells affect 60.50% of the analyzed systems, with a detection precision of 86.06%.