AI tools for programming are no longer just autocomplete or chat assistants: they are now organized as development frameworks, with process, roles, artifacts and verification. Recent surveys map agents and LLMs for software engineering, but none focuses on the operational frameworks that turn these capabilities into process. We ran a directed search of primary sources, using a functional inclusion criterion and a measure of traction, and selected six frameworks: GitHub Spec Kit, OpenSpec, BMAD Method, Get Shit Done (GSD), Spec Kitty and Reversa. Each approaches AI-assisted development by a different route: spec-driven development in full and lightweight variants, agent-driven agile planning, context engineering over the agent, worktree isolation and review, and recovery of operational specifications from legacy systems. Our central contribution is a six-dimension process taxonomy (specification, context, roles, execution, validation and portability) with a scoring rubric that turns it into a replicable instrument. We apply it to the six frameworks and to an out-of-sample case, Spec-Flow. Two results stand out. First, the frameworks that already adopt some process converge: the isolated prompt loses centrality, and persistent artifacts, work contracts, traceability and human review become the mechanisms that reduce ambiguity and coordinate agents. Second, no framework strongly covers all six dimensions, which exposes a structural trade-off between process depth and portability across agents. We also found recurring risks: drift between specification and code, excessive trust in generated artifacts, fragility of community extensions, platform dependence and a lack of benchmarks for the complete process. We close with a research agenda for empirical evaluation, focused on intermediate-quality metrics, context governance, installation security and reproducibility.