Deep learning techniques excel at generating embedding spaces that capture semantic similarities between items. These representations are often paired, enabling experiments with analogies (pairs within the same domain) and cross-modality (pairs across domains). Such experiments rest on specific assumptions about the geometry of embedding spaces, under which paired items can be found by extrapolating the positional relationships between embedding pairs in the training dataset, enabling tasks such as finding new analogies and multimodal zero-shot classification. In this work, we propose a metric to evaluate the similarity between paired item representations. Our proposal is built from the structural similarity between the nearest-neighbor graphs induced by each representation, and can be configured to compare spaces under different distance metrics and different neighborhood sizes. We demonstrate that our proposal can identify similar structures at different scales, which is hard to achieve with kernel methods such as Centered Kernel Alignment (CKA). We further illustrate our method with two case studies: an analogy task using GloVe embeddings, and zero-shot classification on the CIFAR-100 dataset using CLIP embeddings. Our results show that accuracy in both the analogy and zero-shot classification tasks correlates with the embedding similarity. These findings can help explain performance differences in these tasks, and may inform the design of paired-embedding models in the future.
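The general idea of comparing paired representations through their nearest-neighbor structure can be sketched as follows. This is a minimal illustration, not the paper's exact metric: it builds the k-nearest-neighbor set of each item in both spaces (under a configurable distance metric and neighborhood size k, as the abstract describes) and scores similarity as the mean Jaccard overlap between the paired neighbor sets. The function names `knn_sets` and `neighborhood_similarity` are hypothetical, introduced only for this sketch.

```python
import numpy as np

def knn_sets(X, k, metric="cosine"):
    """Return the set of k nearest neighbors for each row of X.

    `metric` selects the distance used to induce the neighbor graph
    (cosine or euclidean in this sketch)."""
    if metric == "cosine":
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        D = 1.0 - Xn @ Xn.T                      # cosine distance matrix
    else:
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)                  # exclude each item itself
    idx = np.argsort(D, axis=1)[:, :k]           # indices of k closest items
    return [set(row) for row in idx]

def neighborhood_similarity(X, Y, k=5, metric="cosine"):
    """Mean Jaccard overlap between the k-NN sets of paired items.

    Rows of X and Y are assumed to be paired (same item, two
    representations). Returns a value in [0, 1]; 1 means the two
    spaces induce identical k-NN graphs."""
    A = knn_sets(X, k, metric)
    B = knn_sets(Y, k, metric)
    return float(np.mean([len(a & b) / len(a | b) for a, b in zip(A, B)]))
```

Varying `k` probes structure at different scales: small neighborhoods compare fine-grained local geometry, while large neighborhoods compare coarser cluster-level structure, which is the kind of multi-scale comparison the abstract contrasts with kernel methods such as CKA.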