We propose reCSE, a self-supervised contrastive learning framework for sentence representation based on feature reshaping. Unlike current state-of-the-art models that rely on discrete data augmentation, our framework reshapes the input features of the original sentence and aggregates the global information of each token, thereby alleviating two problems common to current advanced models: representation polarity and linearly increasing GPU memory consumption. In addition, reCSE achieves competitive performance on semantic similarity tasks. Experiments further demonstrate that the proposed feature reshaping method is highly general: it can be transplanted into other self-supervised contrastive learning frameworks to enhance their representation ability, even achieving state-of-the-art performance. Our code is available at https://github.com/heavenhellchen/reCSE.