Learning good image representations that benefit downstream tasks is a challenging problem in computer vision. A wide variety of self-supervised learning approaches have been proposed to address it; among them, contrastive learning has shown competitive performance on several benchmark datasets. The embeddings in contrastive learning are constrained to a hypersphere, so that the inner (dot) product serves as the distance measure in Euclidean space. However, data in many scientific domains, such as social networks, brain imaging, and computer graphics, exhibit highly non-Euclidean latent geometry. We propose a novel contrastive learning framework that learns semantic relationships in hyperbolic space. Hyperbolic space can be viewed as a continuous analogue of trees; it is naturally suited to modeling hierarchical structures and is thus beneficial for efficient contrastive representation learning. We also extend the proposed Hyperbolic Contrastive Learning (HCL) to the supervised domain and study the adversarial robustness of HCL. Comprehensive experiments show that the proposed method outperforms baseline methods on self-supervised pretraining and supervised classification, and achieves higher robust accuracy.
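To make the contrast between a dot-product similarity and a hyperbolic one concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the Poincaré ball model with curvature -1 (the abstract does not specify which model of hyperbolic space HCL uses) and an InfoNCE-style objective in which the negative geodesic distance replaces the dot product. The function names `poincare_distance` and `hyperbolic_info_nce`, the temperature value, and the toy points are all illustrative assumptions.

```python
import numpy as np


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance in the open unit (Poincare) ball, curvature -1.

    Assumption: the Poincare ball is used here only for illustration.
    Inputs must have Euclidean norm strictly less than 1.
    """
    sq_dist = np.sum((u - v) ** 2, axis=-1)
    denom = (1.0 - np.sum(u ** 2, axis=-1)) * (1.0 - np.sum(v ** 2, axis=-1))
    return np.arccosh(1.0 + 2.0 * sq_dist / np.maximum(denom, eps))


def hyperbolic_info_nce(anchor, candidates, positive_index, temperature=0.5):
    """InfoNCE-style loss with negative hyperbolic distance as the similarity.

    Hypothetical helper: `candidates` holds one positive and several
    negatives; `positive_index` selects the positive row.
    """
    logits = -poincare_distance(anchor[None, :], candidates) / temperature
    logits = logits - logits.max()  # shift for numerical stability
    log_prob = logits - np.log(np.sum(np.exp(logits)))
    return -log_prob[positive_index]


# Toy 2-D embeddings inside the unit disk (made-up values).
anchor = np.array([0.5, 0.0])
candidates = np.array([
    [0.55, 0.05],   # close to the anchor
    [-0.5, 0.0],    # far from the anchor
    [0.0, -0.7],    # far from the anchor
])

loss_near_positive = hyperbolic_info_nce(anchor, candidates, positive_index=0)
loss_far_positive = hyperbolic_info_nce(anchor, candidates, positive_index=1)
```

Under these assumptions the loss is smaller when the designated positive is the candidate nearest to the anchor in hyperbolic distance. Distances also grow rapidly as points approach the boundary of the ball, which is the property that lets tree-like hierarchies embed with low distortion.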