Zero-Shot Adaptation of Behavioral Foundation Models to Unseen Dynamics

Behavioral Foundation Models (BFMs) have proved successful in producing policies for arbitrary tasks in a zero-shot manner, requiring no test-time training or task-specific fine-tuning. Among the most promising BFMs are those that estimate the successor measure, learned in an unsupervised way from task-agnostic offline data. However, these methods fail to react to changes in the dynamics, making them inefficient under partial observability or when the transition function changes. This hinders the applicability of BFMs in real-world settings, e.g., in robotics, where the dynamics can change unexpectedly at test time. In this work, we demonstrate that the Forward-Backward (FB) representation, one of the methods from the BFM family, cannot distinguish between distinct dynamics, leading to interference among the latent directions that parametrize different policies. To address this, we propose an FB model with a transformer-based belief estimator, which greatly facilitates zero-shot adaptation. We also show that partitioning the policy encoding space into dynamics-specific clusters, aligned with the context-embedding directions, yields an additional gain in performance. These traits allow our method to respond to the dynamics observed during training and to generalize to unseen ones. Empirically, in the changing-dynamics setting, our approach achieves up to 2x higher zero-shot returns than the baselines on both discrete and continuous tasks.
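For readers unfamiliar with how an FB representation produces a policy without test-time training, the following is a minimal sketch of the standard FB zero-shot recipe from the prior literature, not the belief-augmented model proposed here. It assumes a task embedding computed as the reward-weighted average of backward embeddings, z = mean_i r_i B(s_i), and a greedy policy over the score F(s, a, z) · z. All names (`infer_task_embedding`, `greedy_action`, `toy_forward`) and the toy numbers are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def infer_task_embedding(backward_embeddings: Sequence[Vector],
                         rewards: Sequence[float]) -> Vector:
    """Estimate z as the average of r_i * B(s_i) over reward-labeled samples."""
    n = len(rewards)
    dim = len(backward_embeddings[0])
    return [
        sum(r * b[k] for b, r in zip(backward_embeddings, rewards)) / n
        for k in range(dim)
    ]


def greedy_action(forward: Callable[[int, int, Vector], Vector],
                  state: int,
                  actions: Sequence[int],
                  z: Vector) -> int:
    """Pick the action maximizing the score F(state, action, z) . z."""
    return max(actions, key=lambda a: dot(forward(state, a, z), z))


def toy_forward(state: int, action: int, z: Vector) -> Vector:
    """Hypothetical stand-in for a learned forward network (ignores state and z)."""
    return [1.0, 0.0] if action == 0 else [0.0, 1.0]


# Two sampled states with hand-picked backward embeddings B(s_i) and rewards r_i.
B = [[1.0, 0.0], [0.0, 1.0]]
r = [0.0, 1.0]

z = infer_task_embedding(B, r)
action = greedy_action(toy_forward, state=0, actions=[0, 1], z=z)
```

Note that nothing in this recipe depends on the transition function: the same z is used whatever dynamics the agent faces, which is the limitation the abstract describes.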

