Foundation Models for Decision Making: Problems, Methods, and Opportunities

Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks. When such models are deployed in real world environments, they inevitably interface with other entities and agents. For example, language models are often used to interact with human beings through dialogue, and visual perception models are used to autonomously navigate neighborhood streets. In response to these developments, new paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning. These paradigms leverage the existence of ever-larger datasets curated for multimodal, multitask, and generalist interaction. Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems that can interact effectively across a diverse range of applications such as dialogue, autonomous driving, healthcare, education, and robotics. In this manuscript, we examine the scope of foundation models for decision making, and provide conceptual tools and technical background for understanding the problem space and exploring new research directions. We review recent approaches that ground foundation models in practical decision making applications through a variety of methods such as prompting, conditional generative modeling, planning, optimal control, and reinforcement learning, and discuss common challenges and open problems in the field.

翻译：在多样化大规模数据上预训练的基础模型已展现出在广泛视觉与语言任务中的非凡能力。当此类模型部署于真实环境时，它们不可避免地需要与其他实体和智能体交互。例如，语言模型常通过对话与人类交互，视觉感知模型则用于自主导航街区。针对这些发展，训练基础模型以与其他智能体交互并执行长期推理的新范式正在兴起。这些范式利用日益庞大的多模态、多任务及通用交互数据集。基础模型与决策领域的交叉研究有望催生强大新系统，使其能在对话、自动驾驶、医疗、教育及机器人等多样化应用中有效交互。本文审视了面向决策的基础模型范畴，提供概念工具与技术背景以理解问题空间并探索新研究方向。我们回顾了近期通过提示、条件生成建模、规划、最优控制及强化学习等方法将基础模型落地于实际决策应用的研究，并探讨了该领域的共性挑战与开放性问题。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【干货书】深度学习合成数据，354页pdf，Synthetic Data for Deep Learning

专知会员服务

105+阅读 · 2022年2月10日