Generality in machine learning means achieving strong performance on a wide range of unseen tasks and domains. Self-supervised learning (SSL) has recently been regarded as an effective route to this goal: it learns high-quality representations from unlabeled data and achieves promising empirical performance on multiple downstream tasks. Existing SSL methods mainly pursue generality in two ways: (i) training on large-scale data, and (ii) learning task-level shared knowledge. However, these methods do not explicitly model generality in the learning objective, and the theoretical understanding of SSL's generality remains limited. As a result, SSL models may overfit in data-scarce settings and generalize poorly in real-world scenarios, which makes true generality difficult to achieve. To address these issues, we provide a theoretical definition of generality in SSL and define a $\sigma$-measurement to help quantify it. Based on this insight, we explicitly incorporate generality into self-supervised learning and propose a novel SSL framework called GeSSL. GeSSL introduces a self-motivated target based on the $\sigma$-measurement, which enables the model to find the optimal update direction towards generality. Extensive theoretical and empirical evaluations demonstrate the superior performance of the proposed GeSSL.