Joint-Embedding Self-Supervised Learning (JE-SSL) has developed rapidly, with many method variations emerging but only a few principled guidelines to help practitioners deploy them successfully. The main reason for this gap is JE-SSL's core principle of not employing any input reconstruction, which leaves no visual cues of unsuccessful training. Combined with uninformative loss values, this makes it difficult to deploy SSL on a new dataset for which no labels are available to judge the quality of the learned representation. In this study, we develop a simple unsupervised criterion that is indicative of the quality of learned JE-SSL representations: their effective rank. Although simple and computationally inexpensive, this method, coined RankMe, allows one to assess the performance of JE-SSL representations, even on different downstream datasets, without requiring any labels. A further benefit of RankMe is that it requires no training and has no hyperparameters to tune. Through thorough empirical experiments involving hundreds of training episodes, we demonstrate how RankMe can be used for hyperparameter selection with nearly no reduction in final performance compared to current selection methods that rely on a dataset's labels. We hope that RankMe will facilitate the deployment of JE-SSL in domains that cannot rely on labels to assess representation quality.
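The abstract names the criterion but does not state its formula. As a minimal sketch, assuming the common entropy-based definition of effective rank (the exponential of the Shannon entropy of the normalized singular values of the embedding matrix), label-free scoring and selection could look like the following. The function names `effective_rank` and `select_by_rank`, the `eps` value, and the synthetic embeddings are illustrative assumptions, not taken from the source.

```python
import numpy as np


def effective_rank(embeddings, eps=1e-7):
    """Entropy-based effective rank of an (n_samples, dim) embedding matrix.

    Assumed definition: exp(H(p)), where p is the singular-value spectrum
    normalized to sum to one. `eps` is a hypothetical stabilizer that avoids
    log(0) and division by zero.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    # Only the singular values are needed, so skip computing U and V.
    sigma = np.linalg.svd(z, compute_uv=False)
    p = sigma / (sigma.sum() + eps) + eps
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))


def select_by_rank(candidates):
    """Pick the candidate whose embeddings have the highest effective rank.

    `candidates` maps a run name to its embedding matrix. No labels are used.
    """
    scores = {name: effective_rank(z) for name, z in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for embeddings produced by two training runs.
    full = rng.standard_normal((512, 32))  # spreads over all 32 dimensions
    collapsed = rng.standard_normal((512, 2)) @ rng.standard_normal((2, 32))  # rank 2
    best, scores = select_by_rank({"run_a": full, "run_b": collapsed})
    print(best, scores)
```

In this sketch, a run whose embeddings collapse onto a few directions scores close to the number of those directions, while a run that uses the full embedding space scores close to the embedding dimension, so ranking runs by this score requires neither labels nor extra training.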