AI-powered web agents have the potential to automate repetitive tasks such as form filling, information retrieval, and scheduling, but they struggle to execute these tasks reliably without human intervention and require users to provide detailed guidance on every run. We address this limitation by automatically synthesizing reusable workflows from an agent's successful and failed attempts. These workflows incorporate execution guards that help agents detect and fix errors while keeping users informed of progress and issues. Our approach enables agents to complete repetitive tasks of the same type with minimal intervention, increasing the success rate from 24.2% to 70.1% across fifteen tasks. To evaluate this approach, we conducted a study with nine users and found that our agent helped them complete web tasks with a higher success rate and less guidance than two baseline methods, and that it allowed them to easily monitor agent behavior and understand failures.