Contrastive self-supervised learning has gained attention for its ability to create high-quality representations from large unlabelled data sets. A key reason that these powerful features enable data-efficient learning of downstream tasks is that they provide augmentation invariance, which is often a useful inductive bias. However, the amount and type of invariance preferred are not known a priori and vary across downstream tasks. We therefore propose a multi-task self-supervised framework (MT-SLVR) that learns both variant and invariant features in a parameter-efficient manner. Our multi-task representation provides a strong and flexible feature that benefits diverse downstream tasks. We evaluate our approach on few-shot classification tasks drawn from a variety of audio domains and demonstrate improved classification performance on all of them.