Context Engineering for AI Agents in Open-Source Software

GenAI-based coding assistants have disrupted software development. The next generation of these tools is agent-based, operating with more autonomy and potentially without human oversight. Like human developers, AI agents require contextual information to develop solutions that are in line with the standards, policies, and workflows of the software projects they operate in. Vendors of popular agentic tools (e.g., Claude Code) recommend maintaining version-controlled Markdown files that describe aspects such as the project structure, code style, or building and testing. The content of these files is then automatically added to each prompt. Recently, AGENTS.md has emerged as a potential standard that consolidates existing tool-specific formats. However, little is known about whether and how developers adopt this format. Therefore, in this paper, we present the results of a preliminary study investigating the adoption of AI context files in 466 open-source software projects. We analyze the information that developers provide in AGENTS.md files, how they present that information, and how the files evolve over time. Our findings indicate that there is no established content structure yet and that there is a lot of variation in terms of how context is provided (descriptive, prescriptive, prohibitive, explanatory, conditional). Our commit-level analysis provides first insights into the evolution of the provided context. AI context files provide a unique opportunity to study real-world context engineering. In particular, we see great potential in studying which structural or presentational modifications can positively affect the quality of the generated content.
