Recently, SimCSE has shown that contrastive learning is a feasible way to train sentence embeddings, and that the resulting embedding space is both aligned and uniform. However, prior studies have shown that dense models can contain harmful parameters that degrade model performance, and SimCSE may well contain such parameters too. Motivated by this, we apply parameter sparsification, using alignment and uniformity scores to measure the contribution of each parameter to the overall quality of the sentence embeddings. Based on a preliminary study, we regard the parameters with minimal contributions as detrimental, since sparsifying them improves model performance. To examine how widespread detrimental parameters are and to remove them, we conduct further experiments on the standard semantic textual similarity (STS) tasks and on transfer learning tasks. The results show that the proposed sparsified SimCSE (SparseCSE) performs favorably compared with SimCSE. Furthermore, through in-depth analysis, we establish the validity and stability of our sparsification method and show that the embedding space produced by SparseCSE is better aligned than that produced by SimCSE, while uniformity remains uncompromised.
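The alignment and uniformity scores mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions for L2-normalized embeddings: alignment as the mean squared distance between positive pairs, and uniformity as the log of the mean Gaussian potential over all pairs. The function names and the temperature `t=2.0` are illustrative choices, not taken from the paper, and the sketch does not reproduce how SparseCSE turns these scores into per-parameter contributions.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment(positive_pairs):
    """Mean squared distance between normalized positive pairs (lower is better)."""
    total = sum(
        sq_dist(l2_normalize(u), l2_normalize(v)) for u, v in positive_pairs
    )
    return total / len(positive_pairs)


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs (lower is more uniform).

    t=2.0 is an assumed, conventional temperature, not a value from the paper.
    """
    normed = [l2_normalize(e) for e in embeddings]
    potentials = [
        math.exp(-t * sq_dist(u, v)) for u, v in combinations(normed, 2)
    ]
    return math.log(sum(potentials) / len(potentials))


if __name__ == "__main__":
    # Toy 2-D embeddings, purely illustrative.
    pairs = [([1.0, 0.0], [0.9, 0.1]), ([0.0, 1.0], [0.1, 0.9])]
    points = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
    print("alignment:", alignment(pairs))
    print("uniformity:", uniformity(points))
```

Under these definitions, lower values are better for both scores, so a claim such as "improved alignment with uncompromised uniformity" corresponds to a lower alignment score at a roughly unchanged uniformity score.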