AI Agents have rapidly gained prominence in both research and industry as systems that extend large language models with planning, tool use, memory, and goal-directed action. Despite this progress, developing and maintaining Agent systems raises recurring engineering difficulties that have not yet been well characterized using evidence from developers themselves. To address this gap, this study analyzes developer discussions on Stack Overflow and failure reports from the GitHub issue trackers of widely used Agent frameworks. For Stack Overflow, an Agent-focused corpus is constructed through tag expansion and filtering, latent topics are extracted with LDA-MALLET, and the resulting topics are manually validated and labeled. For GitHub, a taxonomy of issue themes is developed to capture deployment-time failures and maintenance burdens. The analysis identifies seven Stack Overflow topics (comprising 28 subtopics) and thirteen GitHub issue topics, which are synthesized into five overarching families of Agent challenges: (1) environment, platforms, and dependency management; (2) retrieval, embeddings, and Agent memory; (3) orchestration and execution control; (4) interaction contracts between models and tools; and (5) runtime reliability and operational robustness. Topic popularity and difficulty are also quantified, revealing that widely discussed issues, such as installation and prompting, are often resolved more quickly, whereas retrieval- and orchestration-related challenges are less visible, more complex, and tend to persist as ongoing maintenance burdens on GitHub.