The recent success of Large Language Models (LLMs) marks an impressive stride towards artificial general intelligence. LLMs show promise in automatically completing tasks from user instructions, functioning as brain-like coordinators. As we delegate an increasing number of tasks to machines for automated completion, the associated risks will surface. A central question emerges: how can we make machines behave responsibly when they help humans automate tasks as personal copilots? In this paper, we explore this question in depth from the perspectives of feasibility, completeness, and security. Specifically, we present Responsible Task Automation (ResponsibleTA), a fundamental framework that facilitates responsible collaboration between LLM-based coordinators and executors for task automation by providing three capabilities: 1) predicting the feasibility of commands for executors; 2) verifying the completeness of executors' task execution; 3) enhancing security (e.g., protecting users' privacy). We further propose and compare two paradigms for implementing the first two capabilities: one leverages the generic knowledge of LLMs themselves via prompt engineering, while the other adopts domain-specific learnable models. Moreover, we introduce a local memory mechanism to achieve the third capability. We evaluate ResponsibleTA on UI task automation and hope it draws more attention to making LLMs more responsible in diverse scenarios.
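The three capabilities can be pictured as checkpoints around a coordinator-executor loop. The following is only a minimal sketch under stated assumptions, not the paper's implementation: the names `LocalMemory`, `run_task`, `plan`, `is_feasible`, `execute`, and `is_complete` are hypothetical, the placeholder format `<NAME>` is invented for illustration, and the retry policy is an assumption. The feasibility and completeness checks are passed in as plain callables, so either paradigm (a prompted LLM or a domain-specific learned model) could sit behind them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class LocalMemory:
    """Hypothetical local store: sensitive values stay here, placeholders go out."""
    store: Dict[str, str] = field(default_factory=dict)

    def mask(self, text: str, secrets: Dict[str, str]) -> str:
        # Replace each (non-empty) secret value with a named placeholder.
        for name, value in secrets.items():
            placeholder = f"<{name}>"
            self.store[placeholder] = value
            text = text.replace(value, placeholder)
        return text

    def unmask(self, text: str) -> str:
        # Restore the real values just before local execution.
        for placeholder, value in self.store.items():
            text = text.replace(placeholder, value)
        return text


def run_task(
    instruction: str,
    secrets: Dict[str, str],
    plan: Callable[[str], List[str]],      # coordinator (e.g. an LLM), assumed
    is_feasible: Callable[[str], bool],    # capability 1, assumed interface
    execute: Callable[[str], None],        # executor, assumed
    is_complete: Callable[[str], bool],    # capability 2, assumed interface
    max_retries: int = 1,
) -> List[Tuple[str, str]]:
    memory = LocalMemory()
    masked = memory.mask(instruction, secrets)  # capability 3: hide secrets
    log: List[Tuple[str, str]] = []
    for command in plan(masked):  # the coordinator only sees masked text
        if not is_feasible(command):
            log.append(("skipped", command))
            continue
        for _ in range(max_retries + 1):
            execute(memory.unmask(command))
            if is_complete(command):
                log.append(("done", command))
                break
        else:
            # Every attempt ran without the completeness check passing.
            log.append(("incomplete", command))
    return log


# Toy usage with stub components standing in for real models.
executed: List[str] = []
log = run_task(
    instruction="Log in with password hunter2 then open settings",
    secrets={"PASSWORD": "hunter2"},
    plan=lambda text: [step.strip() for step in text.split(" then ")],
    is_feasible=lambda cmd: "settings" not in cmd,
    execute=executed.append,
    is_complete=lambda cmd: True,
)
```

In this toy run the coordinator stub never sees the password, only the `<PASSWORD>` placeholder; the real value is restored only at the point where the executor runs the command. The log entries keep the masked form so that nothing sensitive is recorded.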