We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks. Premier-TACO leverages a subset of multitask offline datasets to pretrain a general feature representation, which captures critical environmental dynamics and is fine-tuned using minimal expert demonstrations. It advances the temporal action contrastive learning (TACO) objective, known for state-of-the-art results in visual control tasks, by incorporating a novel negative example sampling strategy. This strategy significantly boosts TACO's computational efficiency, making large-scale multitask offline pretraining feasible. Our extensive empirical evaluation on a diverse set of continuous control benchmarks, including DeepMind Control Suite, MetaWorld, and LIBERO, demonstrates Premier-TACO's effectiveness in pretraining visual representations, significantly enhancing few-shot imitation learning of novel tasks. Our code, pretraining data, and pretrained model checkpoints will be released at https://github.com/PremierTACO/premier-taco. Our project webpage is at https://premiertaco.github.io.