Assistance to Autonomy: A Systematic Literature Review of Agentic AI across the Software Development Life Cycle

Organizations are increasingly adopting agentic AI in software product development, yet the field lacks a consolidated synthesis of where adoption is mature, which architectural patterns dominate, and what limitations and coping mechanisms exist in industrial deployments. This systematic literature review (SLR) addresses these gaps and establishes an initial body of knowledge. Following Kitchenham's guidelines, we queried four major research databases and obtained over 1,600 candidate publications. To handle this volume, we developed and validated a domain-agnostic multi-agent screening pipeline that extends prior LLM-assisted review tools by combining automatic metadata curation, iterative inter-agent dialogue, and conflict-resolution defaults that minimize false negatives. Our thematic synthesis of the 92 manually verified primary studies reveals that output verifiability is the primary enabler of agentic adoption: later software development life cycle (SDLC) phases, whose outputs are objectively evaluable through executable feedback, show the highest maturity and industrial presence, while earlier phases remain almost exclusively academic proofs of concept. We identify Planner-Executor-Reviewer role specialization as the dominant architectural pattern, with the Reviewer agent implementing verifiability through executable feedback loops. Across all challenge categories, industrial mitigation strategies converge on confining agent actions to verifiable, bounded spaces. This study makes two contributions: a comprehensive characterization of the current literature on agentic systems in software product development, and an AI-assisted tool that automates the screening phase of high-volume SLRs.
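To make the screening idea concrete, the following is a minimal sketch of a conflict-resolution default that favors inclusion, which is one way to keep false negatives low. It is not the pipeline described in this review: the names `Record`, `screen`, `keyword_screener`, and `max_rounds` are hypothetical, and simple keyword matchers stand in for the LLM-based screening agents.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical decision labels; the review does not specify its own.
INCLUDE, EXCLUDE = "include", "exclude"


@dataclass
class Record:
    title: str
    abstract: str


# A screener receives a record plus the rationales from the previous
# round and returns a (decision, rationale) pair.
Screener = Callable[[Record, List[str]], Tuple[str, str]]


def screen(record: Record, screeners: List[Screener], max_rounds: int = 3) -> str:
    """Run screeners for up to max_rounds rounds of dialogue.

    If the screeners never agree, the record is included, so that a
    disagreement cannot silently drop a relevant study (assumed default).
    """
    rationales: List[str] = []
    for _ in range(max_rounds):
        results = [s(record, rationales) for s in screeners]
        decisions = {decision for decision, _ in results}
        if len(decisions) == 1:
            return decisions.pop()  # unanimous verdict
        # Share rationales so screeners could revise in the next round.
        rationales = [rationale for _, rationale in results]
    return INCLUDE  # unresolved conflict defaults to inclusion


def keyword_screener(keyword: str) -> Screener:
    """Toy stand-in for an LLM agent: include if the keyword appears."""
    def _screen(record: Record, peer_rationales: List[str]) -> Tuple[str, str]:
        text = (record.title + " " + record.abstract).lower()
        if keyword in text:
            return INCLUDE, "found " + keyword
        return EXCLUDE, "missing " + keyword
    return _screen


record = Record("Agentic code review", "A multi-agent system for testing")
verdict = screen(record, [keyword_screener("agent"), keyword_screener("blockchain")])
```

In this sketch the two toy screeners disagree on every round, so `verdict` falls back to inclusion and the record is passed on to manual verification instead of being discarded.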
