Transformer-based models have made significant strides in semantic matching tasks by capturing the connections between phrase pairs. However, assessing the relevance of a sentence pair requires more than examining the overall similarity between the two sentences; the subtle semantic differences that distinguish them must also be taken into account. Unfortunately, the softmax operation in transformer attention tends to wash out these fine-grained differences. To this end, we propose a novel semantic sentence matching model named the Combined Attention Network based on the Transformer model (Comateformer). In Comateformer, we design a novel transformer-based quasi-attention mechanism with compositional properties. Unlike traditional attention mechanisms, which merely re-weight input tokens, our method learns how to combine, subtract, or rescale specific vectors when building a representation. Moreover, our approach builds on the intuition of similarity and dissimilarity (negative affinity) to compute dual affinity scores, yielding a more meaningful representation of the relationships between sentences. To evaluate the proposed model, we conducted extensive experiments on ten public real-world datasets along with robustness tests. Experimental results show that our method achieves consistent improvements.
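The abstract does not spell out the exact formulation of the quasi-attention mechanism, so the following is a minimal PyTorch sketch of one way a compositional, dual-affinity attention layer could behave. The class name QuasiAttention, the single-head setting, and the pos − neg gating scheme are illustrative assumptions, not the paper's definition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuasiAttention(nn.Module):
    """Hypothetical single-head quasi-attention with dual affinity scores.

    A sketch only: the real Comateformer formulation is not given in the
    abstract, so the scoring and gating choices here are assumptions.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # Cross-attend tokens of sentence x over tokens of sentence y.
        q, k, v = self.q_proj(x), self.k_proj(y), self.v_proj(y)
        scores = q @ k.transpose(-2, -1) * self.scale

        # Positive affinity: standard softmax, emphasizing similar tokens.
        pos = F.softmax(scores, dim=-1)
        # Negative affinity: softmax over negated scores, emphasizing the
        # most dissimilar tokens.
        neg = F.softmax(-scores, dim=-1)

        # Signed weights in (-1, 1): each value vector can be added,
        # subtracted, or effectively scaled toward zero when composing
        # the output, rather than only mixed with non-negative weights.
        quasi_weights = pos - neg
        return quasi_weights @ v


# Toy usage: batch of 2, sentence A with 10 tokens, sentence B with 12.
attn = QuasiAttention(d_model=64)
a = torch.randn(2, 10, 64)
b = torch.randn(2, 12, 64)
out = attn(a, b)  # shape: (2, 10, 64)
```

Because the signed weights lie in (-1, 1), this sketch mirrors the combine/subtract/rescale behavior described above, in contrast to softmax-only attention whose weights are always non-negative.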