We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks. Premier-TACO leverages a subset of multitask offline datasets to pretrain a general feature representation that captures critical environmental dynamics and is fine-tuned using minimal expert demonstrations. It advances the temporal action contrastive learning (TACO) objective, known for state-of-the-art results in visual control tasks, by incorporating a novel negative example sampling strategy. This strategy is crucial in significantly boosting TACO's computational efficiency, making large-scale multitask offline pretraining feasible. Our extensive empirical evaluation on a diverse set of continuous control benchmarks, including Deepmind Control Suite, MetaWorld, and LIBERO, demonstrates Premier-TACO's effectiveness in pretraining visual representations, significantly enhancing few-shot imitation learning on novel tasks. Our code, pretraining data, and pretrained model checkpoints will be released at https://github.com/PremierTACO/premier-taco. Our project webpage is at https://premiertaco.github.io.