Software Engineering Agents for Embodied Controller Generation: A Study in Minigrid Environments

Software Engineering Agents (SWE-Agents) have proven effective for traditional software engineering tasks with accessible codebases, but their performance for embodied tasks requiring well-designed information discovery remains unexplored. We present the first extended evaluation of SWE-Agents on controller generation for embodied tasks, adapting Mini-SWE-Agent (MSWEA) to solve 20 diverse embodied tasks from the Minigrid environment. Our experiments compare agent performance across different information access conditions: with and without environment source code access, and with varying capabilities for interactive exploration. We quantify how different information access levels affect SWE-Agent performance for embodied tasks and analyze the relative importance of static code analysis versus dynamic exploration for task solving. This work establishes controller generation for embodied tasks as a crucial evaluation domain for SWE-Agents and provides baseline results for future research in efficient reasoning systems.
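To make the term "controller" concrete, here is a minimal sketch of the kind of artifact an agent might be asked to produce: a function that maps an observation to an action, scored by whether it reaches a goal within a step budget. Everything in it is an assumption for illustration. `ToyGrid`, `controller`, and `run_episode` are hypothetical names, and the toy environment is a simplified stand-in rather than the Minigrid API or the paper's evaluation harness. Only the turn-left / turn-right / move-forward action style is borrowed from Minigrid.

```python
from dataclasses import dataclass

# Hypothetical toy stand-in for an embodied grid task; NOT the Minigrid API.
LEFT, RIGHT, FORWARD = 0, 1, 2
# Headings: 0 = east, 1 = south, 2 = west, 3 = north (x grows east, y grows south).
DIRS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


@dataclass
class ToyGrid:
    size: int = 5
    x: int = 0
    y: int = 0
    heading: int = 0
    goal: tuple = (4, 4)

    def observe(self):
        # Fully observable state; real embodied tasks often expose far less.
        return {"pos": (self.x, self.y), "heading": self.heading, "goal": self.goal}

    def step(self, action):
        # Apply one action and report whether the agent now stands on the goal.
        if action == LEFT:
            self.heading = (self.heading - 1) % 4
        elif action == RIGHT:
            self.heading = (self.heading + 1) % 4
        elif action == FORWARD:
            dx, dy = DIRS[self.heading]
            nx, ny = self.x + dx, self.y + dy
            if 0 <= nx < self.size and 0 <= ny < self.size:
                self.x, self.y = nx, ny
        return (self.x, self.y) == self.goal


def controller(obs):
    # Hand-written policy: close the x gap first, then the y gap.
    (x, y), (gx, gy) = obs["pos"], obs["goal"]
    if x != gx:
        want = 0 if gx > x else 2
    else:
        want = 1 if gy > y else 3
    if obs["heading"] == want:
        return FORWARD
    # Turn right only when that is a single quarter turn; otherwise turn left.
    return RIGHT if (want - obs["heading"]) % 4 == 1 else LEFT


def run_episode(env, policy, max_steps=50):
    # Returns (solved, steps_used) for one episode under a step budget.
    for t in range(1, max_steps + 1):
        if env.step(policy(env.observe())):
            return True, t
    return False, max_steps


if __name__ == "__main__":
    print(run_episode(ToyGrid(), controller))
```

In this sketch the controller was written with full knowledge of the environment's rules. The information access conditions compared in the abstract concern how an agent obtains that knowledge in the first place: by reading the environment source code, by interacting with the environment, or both.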

