Keeping software systems up to date is essential to avoid technical debt, security vulnerabilities, and the rigidity typical of legacy systems. However, updating libraries and frameworks remains a time-consuming and error-prone process. Recent advances in Large Language Models (LLMs) and agentic coding systems offer new opportunities for automating such maintenance tasks. In this paper, we evaluate the update of a well-known Python library, SQLAlchemy, across a dataset of ten client applications. For this task, we use GitHub Copilot Agent Mode, an autonomous AI system capable of planning and executing multi-step migration workflows. To assess the effectiveness of the automated migration, we also introduce Migration Coverage, a metric that quantifies the proportion of API usage points correctly migrated. The results of our study show that the LLM agent was capable of migrating functionality and API usages between SQLAlchemy versions (median migration coverage: 100%), but failed to preserve application functionality, leading to a low test-pass rate (median: 39.75%).
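As a rough illustration of the Migration Coverage metric described above, the following minimal sketch computes it as the percentage of API usage points that were correctly migrated, and then takes the median across applications. The function name, its parameters, and the per-application counts are hypothetical and chosen only for illustration; they are not the paper's data, and the paper's exact procedure for deciding whether a usage point counts as "correctly migrated" may differ.

```python
from statistics import median


def migration_coverage(migrated_correctly: int, total_usage_points: int) -> float:
    """Percentage of API usage points that were correctly migrated.

    Assumption: coverage = 100 * correctly migrated / total usage points.
    """
    if total_usage_points <= 0:
        raise ValueError("total_usage_points must be positive")
    if not 0 <= migrated_correctly <= total_usage_points:
        raise ValueError("migrated_correctly must be between 0 and total_usage_points")
    return 100.0 * migrated_correctly / total_usage_points


# Hypothetical (correctly migrated, total) counts for three client applications.
example_counts = [(12, 12), (30, 30), (7, 10)]

coverages = [migration_coverage(done, total) for done, total in example_counts]
median_coverage = median(coverages)
print(coverages, median_coverage)
```

Reporting the median rather than the mean keeps a single poorly migrated application from dominating the summary, which matches how the abstract states its results.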