With the rapid adoption of LLM-assisted coding, managing the technical debt these systems introduce has become urgent. In this paper, we conduct a multivocal literature review of 104 sources (31 formal, 73 grey) to examine how LLM-assisted development contributes to technical debt and what strategies, metrics, and benchmarks exist to mitigate it. We find that LLMs often amplify traditional forms of technical debt, particularly code, design, and documentation debt, while also introducing new LLM-specific forms of debt. Notably, we identify fast-integration debt, in which rapidly generated code prioritizes speed over quality, triggering a domino effect that leads to governance debt and higher long-term maintenance costs. Additional emerging categories include prompt, ethical, data, and provenance debt, reflecting challenges unique to LLM adoption. To address these, the literature suggests strategies including human-in-the-loop frameworks, prompt engineering, and data quality alignment. In practice, tools such as SonarQube are commonly used to detect technical debt indicators, while research prototypes such as CodeSmellEval are emerging to assess how LLMs contribute to debt. However, no standardized benchmarks or LLM-specific metrics yet exist, leaving an important gap. Based on these findings, we outline insights and future directions for the reliable integration of LLMs into software engineering workflows.