In this paper, we propose a novel self-supervised transfer learning method called Distribution Matching (DM), which drives the representation distribution toward a predefined reference distribution while preserving augmentation invariance. The design of DM yields a learned representation space that is intuitively structured and offers easily interpretable hyperparameters. Experimental results across multiple real-world datasets and evaluation metrics demonstrate that DM performs competitively on target classification tasks compared with existing self-supervised transfer learning methods. In addition, we provide rigorous theoretical guarantees for DM, including a population theorem and an end-to-end sample theorem. The population theorem bridges the gap between the self-supervised learning task and target classification accuracy, while the sample theorem shows that, even with a limited number of samples from the target domain, DM can deliver strong classification performance, provided the unlabeled sample size is sufficiently large.
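The abstract does not specify DM's training objective, but the two stated design goals — matching the representation distribution to a predefined reference while preserving augmentation invariance — can be illustrated with a minimal sketch. The sketch below is an assumption, not the paper's actual loss: it uses a sliced-Wasserstein term to pull pooled representations toward reference samples (e.g., drawn from a standard Gaussian) and a mean-squared invariance term between two augmented views, with a hypothetical trade-off weight `lam` standing in for the interpretable hyperparameters the abstract mentions.

```python
import numpy as np

def sliced_wasserstein(x, y, n_proj=64, seed=0):
    # Monte Carlo estimate of the sliced 2-Wasserstein distance between two
    # empirical distributions x, y of shape (n, d) (same n for both).
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((n_proj, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # unit directions
    # Project onto each direction and sort: 1-D Wasserstein via order statistics.
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    return np.sqrt(np.mean((px - py) ** 2))

def dm_loss(z1, z2, reference, lam=1.0):
    # z1, z2: representations of two augmented views of the same batch, (n, d).
    # reference: (2n, d) samples from the predefined reference distribution.
    # Invariance term: the two views should map to nearby representations.
    invariance = np.mean((z1 - z2) ** 2)
    # Matching term: pooled representations should follow the reference law.
    z = np.concatenate([z1, z2], axis=0)
    return invariance + lam * sliced_wasserstein(z, reference)
```

In this illustrative form, `lam` trades off how strongly the representation distribution is pulled toward the reference versus how strictly augmentation invariance is enforced.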