Keeping software systems up to date is essential to avoid technical debt, security vulnerabilities, and the rigidity typical of legacy systems. However, updating libraries and frameworks remains a time-consuming and error-prone process. Recent advances in Large Language Models (LLMs) and agentic coding systems offer new opportunities for automating such maintenance tasks. In this paper, we evaluate the automated update of a well-known Python library, SQLAlchemy, across a dataset of ten client applications. For this task, we use GitHub Copilot's Agent Mode, an autonomous AI system capable of planning and executing multi-step migration workflows. To assess the effectiveness of the automated migration, we also introduce Migration Coverage, a metric that quantifies the proportion of API usage points correctly migrated. The results of our study show that the LLM agent was capable of migrating functionalities and API usages between SQLAlchemy versions (migration coverage: 100%, median), but failed to maintain application functionality, leading to a low test-pass rate (39.75%, median).
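A minimal sketch of how the Migration Coverage metric can be expressed, assuming it is the ratio of correctly migrated API usage points to all usage points affected by the version change (the precise formulation appears in the paper body; the symbols $U$ and $M$ are introduced here for illustration only):
\[
  \text{Migration Coverage} = \frac{|M|}{|U|} \times 100\%,
\]
where $U$ denotes the set of API usage points in a client application that require changes for the target SQLAlchemy version, and $M \subseteq U$ denotes the subset correctly migrated by the agent.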