This paper investigates how contrastive and supervised learning methods organize data differently, focusing on the concept of locally dense clusters. We introduce a novel metric, Relative Local Density (RLD), to quantify local density within clusters, and we provide visual examples that distinguish locally dense clusters from globally dense ones. Comparing the clusters formed by the two approaches, we find that contrastive learning generates clusters that are locally dense but not globally dense, while supervised learning creates clusters that are both locally and globally dense. We further explore a Graph Convolutional Network (GCN) classifier as an alternative to linear classifiers for handling locally dense clusters. Finally, we use t-SNE visualizations to substantiate the differences between the features generated by contrastive and supervised learning methods. We conclude by proposing future research directions, including the development of efficient classifiers tailored to contrastive learning and the design of new augmentation algorithms.
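To make the distinction between local and global density concrete, the following is a minimal sketch on toy data, not the paper's method. It assumes two concentric rings as a stand-in for clusters that are locally dense without being globally dense, and it uses a k-nearest-neighbor label agreement score as a simple proxy for local structure. This proxy is not the RLD metric, whose definition is not given in the abstract. The function names `make_rings`, `knn_label_agreement`, and `linear_accuracy` are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_rings(n_per_class=200, radii=(1.0, 3.0), noise=0.05, seed=0):
    """Toy data (assumption): two concentric rings, one class per ring."""
    rng = np.random.default_rng(seed)
    points, labels = [], []
    for label, radius in enumerate(radii):
        angles = rng.uniform(0.0, 2.0 * np.pi, n_per_class)
        r = radius + rng.normal(0.0, noise, n_per_class)
        points.append(np.column_stack([r * np.cos(angles), r * np.sin(angles)]))
        labels.append(np.full(n_per_class, label))
    return np.vstack(points), np.concatenate(labels)


def knn_label_agreement(x, y, k=5):
    """Mean fraction of each point's k nearest neighbors that share its label.

    A stand-in proxy for local structure, NOT the paper's RLD metric.
    """
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude each point from its own neighbors
    neighbors = np.argsort(dists, axis=1)[:, :k]
    return float((y[neighbors] == y[:, None]).mean())


def linear_accuracy(x, y):
    """Training accuracy of a least-squares linear classifier with a bias term."""
    xb = np.column_stack([x, np.ones(len(x))])
    targets = 2.0 * y - 1.0  # map labels {0, 1} to {-1, +1}
    w, *_ = np.linalg.lstsq(xb, targets, rcond=None)
    predictions = (xb @ w > 0).astype(int)
    return float((predictions == y).mean())


if __name__ == "__main__":
    x, y = make_rings()
    print("kNN label agreement:", knn_label_agreement(x, y))
    print("linear classifier accuracy:", linear_accuracy(x, y))
```

On data of this shape, neighbors almost always share a label, yet no single hyperplane separates the rings, so the neighbor agreement should be high while the linear classifier's accuracy stays well below it. This is the kind of gap that motivates a neighborhood-aware classifier, such as the GCN explored in the paper, over a linear one.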