Software engineering agents (SWE agents) increasingly work through tool-mediated trajectories in real repositories, yet their behavior remains difficult to characterize in concrete, observable terms. These trajectories record tool use, intermediate reasoning, evidence selection, and self-directed stopping, but they do not by themselves explain why particular moves were chosen, what evidence was trusted, or when understanding was judged sufficient. This tension makes trajectory data both limited and valuable: faithful, replayable traces can become an empirical substrate for studying agent behavior when interpreted through disciplined observation. We introduce Ada, a scoped apparatus for repository-level code understanding. Ada enters real codebases through a bounded tool interface, allowing open-ended exploration to remain recordable as finite trajectories. Within this wild-but-bounded setting, Ada chooses where to look, what to read closely, when to consolidate partial understanding, and when to close its account of the repository. We project Ada's think-action chains through observation lenses that make navigation, evidence selection, synthesis, grounding, and stopping visible without reducing behavior to raw tool counts or speculating about hidden intent. Read together, these lenses produce behavioral profiles grounded in recorded movement through software worlds. Across 408 trajectories spanning multiple models, repositories, task families, and launch conditions, the study shows how faithful digital traces can be transformed into disciplined, comparable projections of an emerging SWE-agent mindset. The results expose differences in efficiency, trajectory diversity, epistemic grounding, and the limits of intervention, while providing a methodological foundation for observing SWE-agent behavior in real codebases.