Representation-based Siamese networks have become popular for lightweight text matching because of their low deployment and inference costs. While word-level attention mechanisms have already been incorporated into Siamese networks to improve performance, we propose Feature Attention (FA), a novel downstream block designed to enrich the modeling of dependencies among embedding features. Using "squeeze-and-excitation" techniques, the FA block dynamically adjusts the emphasis on individual features, enabling the network to concentrate on the features that contribute most to the final classification. Building upon FA, we introduce a dynamic "selection" mechanism called Selective Feature Attention (SFA), which leverages a stacked BiGRU Inception structure. The SFA block facilitates multi-scale semantic extraction by traversing different stacked BiGRU layers, encouraging the network to selectively concentrate on semantic information and embedding features across varying levels of abstraction. Both the FA and SFA blocks integrate seamlessly with various Siamese networks in a plug-and-play manner. Experimental evaluations across diverse text matching baselines and benchmarks underscore the indispensability of modeling feature attention and the superiority of the "selection" mechanism.
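To make the "squeeze-and-excitation" idea concrete, the following is a minimal sketch of feature-wise gating over a sequence of token embeddings. It is not the FA block itself: the mean-pooling squeeze, the two-layer bottleneck with a `reduction` ratio, the random initialization, and all function names (`squeeze`, `excite`, `feature_attention`, `init_params`) are assumptions borrowed from the generic squeeze-and-excitation pattern, chosen only for illustration.

```python
import math
import random


def squeeze(embeddings):
    """Squeeze step: average each feature over the sequence, (seq_len, dim) -> (dim,)."""
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    return [sum(tok[j] for tok in embeddings) / seq_len for j in range(dim)]


def dense(vec, weights, bias):
    """Fully connected layer; weights has shape (out_dim, in_dim)."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]


def excite(summary, w1, b1, w2, b2):
    """Excitation step: bottleneck with ReLU, then a sigmoid gate per feature."""
    hidden = [max(0.0, h) for h in dense(summary, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]


def feature_attention(embeddings, w1, b1, w2, b2):
    """Rescale every token's features by the per-feature gates."""
    gates = excite(squeeze(embeddings), w1, b1, w2, b2)
    reweighted = [[x * g for x, g in zip(tok, gates)] for tok in embeddings]
    return reweighted, gates


def init_params(dim, reduction, seed=0):
    """Hypothetical random parameters; in a real model these would be learned."""
    rng = random.Random(seed)
    hidden = max(1, dim // reduction)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(dim)]
    return w1, [0.0] * hidden, w2, [0.0] * dim


# Toy input: 3 tokens with 4 embedding features each (made-up values).
tokens = [
    [0.2, -1.0, 0.5, 0.0],
    [0.4, 0.3, -0.5, 1.0],
    [0.0, 0.1, 0.6, -0.2],
]
params = init_params(dim=4, reduction=2)
reweighted, gates = feature_attention(tokens, *params)
print("gates:", [round(g, 3) for g in gates])
```

Each gate lies strictly between 0 and 1 and is shared across all tokens, so the block changes how strongly each embedding feature is expressed rather than which words are attended to. That is the distinction the abstract draws between feature attention and word-level attention.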